(57) Abstract

The invention relates to methods and reagents for the diagnosis and treatment of a disease caused by or associated with an RNA 
molecule having a transcript mutation giving rise to a frameshift mutation. The diagnostic methods include the steps of providing a body 
fluid or tissue sample from a patient; and analyzing the sample for the presence of an RNA molecule having a frameshift mutation or a 
protein encoded thereby, wherein the presence of the mutated RNA molecule or encoded protein is indicative of the disease. The therapeutic
treatments include administering substances which selectively eliminate the mutant RNA molecule from the cell.

DIAGNOSIS METHOD AND REAGENTS

BACKGROUND OF THE INVENTION 
The invention encompasses methods and reagents for the diagnosis of a 
disease caused by or associated with a transcript mutation giving rise to a frameshift 
mutation within an RNA molecule. The methods include the steps of providing a 
body fluid or tissue sample from a patient; and analyzing the sample for the presence 
of an RNA molecule having a frameshift mutation or a protein encoded thereby,
wherein the presence of the mutated RNA or encoded protein is indicative of the 
disease. 

It is an object of the present invention to provide methods and assays for 
detection and/or treatment of diseases involving transcript mutations, particularly 
those diseases relating to aging, wherein the probability of having the disease increases 
with the age of the patient. The invention contemplates detection and/or treatment of 
those age-related diseases which are due to mutations occurring in the RNA of cells. 
If the mutations are not corrected, the disease may result. 

Another object of the invention is to treat diseases identified according to the 
invention, by providing to a patient afflicted with the disease or having a propensity to 
develop the disease, a corrective agent such as an enzyme or oligonucleotide. 

Yet another object of the invention is to provide a method for identifying 
age-related diseases by correlating nucleotide sequence mutation hotspots with the 
disease.

Other objects of the invention relate to identification, detection and treatment 
of age-related diseases including cancers (especially non-hereditary cancers) and 
neurodegenerative diseases, such as Alzheimer's Disease (AD), Downs' syndrome, 
frontal lobe dementia (Pick's Disease), progressive supranuclear palsy (PSP) and other 
diseases with abundant tau-positive filamentous lesions (such as Corticobasal 
degeneration, Dementia pugilistica, Dementia with tangles only, Dementia with 
tangles and calcification, Frontotemporal dementias with Parkinsonism linked to 
chromosome 17, Gerstmann-Sträussler-Scheinker disease with tangles, Myotonic
dystrophy, Niemann-Pick disease type C, Parkinsonism-dementia complex of Guam,
Postencephalitic Parkinsonism and Subacute sclerosing panencephalitis), Parkinson's
Disease (PD), amyotrophic lateral sclerosis, Huntington's Disease, multiple sclerosis, 
dementia with Lewy bodies, multisystem atrophy, other inclusion body diseases 
associated with ubiquitin (such as Alexander's disease, Alcoholic liver disease, lichen 
amyloidosis, and during aging Marinesco bodies and Hyaline inclusions), diabetes 
mellitus type II and other degenerative diseases, such as cardiovascular diseases and 
rheumatoid arthritis. Early disease diagnosis is important for effective treatment. 

Alzheimer's Disease is in most cases a disease which is related to aging. AD 
is characterized by atrophy of nerve cells in the cerebral cortex, subcortical areas, and 
hippocampus and the presence of plaques, dystrophic neurites, neuropil threads and 
neurofibrillary tangles. In most cases, it is not known whether AD is caused by a 
genetic abnormality or by environmental factors, or both. The pathogenic mutation is 
unknown. 

Another object of the invention is to provide a diagnostic test for AD which 
enables definitive diagnosis of AD in living patients. Furthermore, as AD is a 
progressive disease, it is desirable to diagnose AD as early as possible so that 
preventative action may be taken. 

A number of diagnostic methods have been previously suggested for AD 
diagnosis, most of which have focused on the β-amyloid precursor protein. See for
example U.S. Patents 4,666,829, 4,816,416 and 4,933,159. However, β-amyloid
deposits have been found in individuals, especially aged persons, who have not shown
signs of dementia (See J. Biol. Chem., 265, pp 15977, 1990; and Tables 3-5).
Diagnostic tests based on the β-amyloid protein have therefore been shown to lack
specificity for AD. 

In U.S. Patent 4,727,041 a diagnostic test for AD is described based on 
measuring levels of somatotropin and somatomedin-C in blood sera following 
administration of an L-dopa provocative test.

In International patent application WO 94/02851, a method is described for 
identifying AD by the use of antibodies having affinity for paired helical filaments in 
order to determine the levels of paired helical filaments in cerebral spinal fluid. The 
presence of paired helical filaments is alleged to be indicative of AD.

Other diagnostic methods are based on the identification of "disease specific 
marker proteins" in the cerebrospinal fluid. In International patent application WO 
95/05604, for example, five disease specific proteins are identified by their molecular 
weights. However, the specific identity of the proteins is unknown and their specific 
relationship to the pathogenesis of AD is also unknown. The five "disease specific 
marker proteins" may therefore be present as a result of a more fundamental cellular or
biochemical change. 

Another object of the invention is to provide for detection of AD preferably 
early on in the disease state. It is desirable to detect a protein or substance which is 
either directly responsible for the disease or is involved early on in the pathogenesis of 
the disease, or if not involved is nevertheless generated directly or indirectly by the 
mechanism causing the disease. Such a protein or substance may be the "causative" 
agent to the disease or may be "associated with" the disease state in the sense of being 
diagnostic of the disease state. 

Recently, Sherrington et al. in Nature, 375, pp 254-260, 1995, identified a
gene on chromosome 14 bearing missense mutations which are associated with up to 
33% of autosomal dominant early onset AD cases (Table 1). A missense mutation 
involves a nucleotide substitution, usually a single nucleotide substitution, which 
results in an amino acid substitution at the corresponding codon. The missense 
mutations disclosed in Sherrington et al. are predicted to change the encoded amino
acid at the following positions (numbering from the first putative initiation codon): Met
to Leu at codon 146, His to Arg at codon 163, Ala to Glu at codon 246, Leu to Val at 
codon 286, Cys to Tyr at codon 410. It has been proposed that these mutations may be 
useful in identifying early onset AD. As stated earlier, the majority of AD cases are 
late onset (after 65 years of age; Table 1) and it is therefore still a problem to identify 
the majority of individuals having AD, particularly late onset AD. 

There is no indication in the prior art that the mutations underlying these diseases
occur at the RNA level and not at the DNA level. Accordingly, the prior art methods
of detection are for mutated DNA or for a protein encoded by the mutated DNA, and
will not give an indication of the presence of a transcript mutation in an RNA molecule.

Presently, there are a number of substances which are alleged to be useful in 
the treatment of AD. However, so far only limited success has been achieved with 
these substances. Methods for effectively treating and/or preventing AD are still 
required (see Allen and Burns, Journal of Psychopharmacology, 2, pp 43-56, 1995).

SUMMARY OF THE INVENTION 
The present invention is based on the observation that an RNA molecule
containing a frameshift mutation, and the corresponding mutant protein it encodes, are
correlated with the presence of a disease.

According to the present invention there is provided a method for the 
diagnosis of a disease caused by or associated with an RNA molecule having one or 
more mutations giving rise to a frameshift mutation comprising: i. providing a
biological sample, such as a body fluid or tissue sample, from a patient; and ii.
analyzing the sample for the presence of an RNA molecule having a frameshift
mutation or a mutant protein encoded thereby, wherein the presence of the mutated
RNA or mutant protein is indicative of the disease.

A "mutant" protein is a polypeptide encoded by a mutant mRNA at least a 
part of which is in a reading frame that is shifted relative to the initiation start codon 
from that of the native or wild-type reading frame, and thus will include any protein
having an aberrant carboxy terminal portion which is encoded by the +1 or +2 reading
frame of the wild type gene sequence. Thus, the mutant protein will include a hybrid
wild-type/nonsense protein having an amino terminal amino acid sequence that is
encoded by the wild type (0) reading frame and a carboxy terminal amino acid
sequence that is encoded by the +1 or +2 reading frame and thus forms the nonsense
portion of the mutant protein. The cross-over point between the wild type and nonsense
amino acid sequences is the codon containing the frameshift mutation.
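
By way of illustration only, the hybrid wild-type/nonsense structure described above can be sketched in Python. The toy sequence WILD_TYPE, the deletion position, and the helper names translate, delete_bases and crossover_point are hypothetical and are not taken from the present disclosure; the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_bases(seq, start, length):
    """Return seq with `length` bases removed starting at index `start`."""
    return seq[:start] + seq[start + length:]


def crossover_point(wild_protein, mutant_protein):
    """Number of leading residues shared by the two proteins."""
    n = 0
    for a, b in zip(wild_protein, mutant_protein):
        if a != b:
            break
        n += 1
    return n


# Hypothetical toy coding sequence containing a GAGA motif at index 6.
WILD_TYPE = "ATGGCTGAGAGCAAATAA"
MUTANT = delete_bases(WILD_TYPE, 6, 2)  # assumed GA dinucleotide deletion

wild_protein = translate(WILD_TYPE)
mutant_protein = translate(MUTANT)
shared = crossover_point(wild_protein, mutant_protein)
print(wild_protein, mutant_protein, shared)
```

In this toy case the two proteins share their amino terminal residues and diverge after the codon affected by the deletion, which is the hybrid arrangement the definition describes.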

The invention is based on the discovery of the presence of such a mutant 
protein or an accumulation of more than one mutant protein in a tissue from a diseased 
individual, and also on identification of the mutant protein as indicative of the disease.
The invention is also based on the discovery that the mutation that gives rise to the
mutant protein occurs at the RNA level and not at the DNA level. 

The phrase "caused by or associated with" refers to an RNA molecule which 
is either fully or partly responsible for the disease, or an RNA molecule which is not 
responsible for the disease but is associated with the diseased state in the sense that it 
is diagnostic of the diseased state. 

A disease caused by or associated with at least one RNA molecule having 
one or more mutations giving rise to a frameshift mutation can be any disease 
including non-hereditary cancers, neurodegenerative diseases such as Alzheimer's 
Disease (AD); Downs' syndrome; frontal lobe dementia (Pick's Disease); progressive 
supranuclear palsy (PSP) and other diseases with abundant tau-positive filamentous 
lesions such as Corticobasal degeneration, Dementia pugilistica, Dementia with 
tangles only, Dementia with tangles and calcification, Frontotemporal dementias with 
Parkinsonism linked to chromosome 17, Gerstmann-Sträussler-Scheinker disease with
tangles, Myotonic dystrophy, Niemann-Pick disease type C, Parkinsonism-dementia
complex of Guam, Postencephalitic Parkinsonism and Subacute sclerosing
panencephalitis; Parkinson's Disease (PD); amyotrophic lateral sclerosis; Huntington's
Disease; multiple sclerosis; dementia with Lewy bodies, multisystem atrophy and 
other inclusion body diseases associated with ubiquitin such as Alexander's disease, 
Alcoholic liver disease, lichen amyloidosis, and during aging Marinesco bodies and 
Hyaline inclusions; and other degenerative diseases such as cardiovascular diseases, 
rheumatoid arthritis and Diabetes mellitus type II. Cancers treatable according to the 
invention include, but are not limited to, Hodgkin's disease, acute and chronic 
lymphocytic leukemias, multiple myeloma, breast, ovary, lung, and stomach or 
bladder cancers. 

An RNA molecule having a transcript mutation which leads to a frameshift 
mutation, and herein referred to as the "mutant RNA", can be any RNA molecule 
having at least one transcript mutation which leads to a frameshift mutation. The RNA 
molecule may be any RNA molecule including primary transcripts and messenger 
RNA (mRNA).

The term "transcript mutation" refers to a mutation which occurs at the RNA
level but does not occur in the DNA from which the RNA was transcribed. In order to
identify transcript mutations a comparison between the RNA and the DNA from which
the RNA was transcribed has to be made.

A "frameshift mutation" refers to a deletion or insertion of one or more
nucleotides within an open reading frame, for example, a single nucleotide or
dinucleotide deletion or insertion, such that the reading frame of the coding region is
shifted by one or two nucleotides. Preferably, the frameshift mutation is a nucleotide
or dinucleotide deletion leading to a +1 or +2 frameshift mutation. However, any
number of nucleotide deletions can occur provided a frameshift mutation results.
Alternatively, the insertion of one or more nucleotides may give rise to a frameshift 
and such mutations also form part of the present invention. 
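
The comparison between an RNA and the DNA from which it was transcribed, and the test for whether a deletion shifts the reading frame, can be sketched as follows. This is a minimal illustration under stated assumptions: both sequences are already available as strings covering the same region, the RNA differs from the DNA by at most one contiguous deletion, and the function name find_single_deletion is hypothetical. Real sequence data would normally call for a proper alignment tool.

```python
def find_single_deletion(dna, rna):
    """Locate one contiguous deletion in `rna` relative to `dna`.

    Returns (position, deleted_bases, shifts_frame), or None if the RNA
    is not explained by exactly one deletion. The position is reported
    left-aligned, because a deletion inside a repeat such as GAGA cannot
    be placed uniquely.
    """
    dna = dna.upper()
    rna = rna.upper().replace("U", "T")
    d = len(dna) - len(rna)
    if d <= 0:
        return None
    # Length of the common prefix of the two sequences.
    p = 0
    while p < len(rna) and dna[p] == rna[p]:
        p += 1
    # Removing d bases at p must reproduce the RNA exactly.
    if dna[:p] + dna[p + d:] != rna:
        return None
    # Shift the deletion as far 5' as the repeat structure allows.
    while p > 0 and dna[p - 1] == dna[p + d - 1]:
        p -= 1
    # A deletion shifts the frame unless its length is a multiple of 3.
    return p, dna[p:p + d], d % 3 != 0


# Hypothetical toy sequences: genomic coding DNA and the observed transcript.
GENOMIC = "ATGGCTGAGAGCAAATAA"
TRANSCRIPT = "AUGGCUGAGCAAAUAA"
print(find_single_deletion(GENOMIC, TRANSCRIPT))
```

A deletion of three nucleotides would be reported with shifts_frame set to False, since the downstream reading frame is then preserved.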

Other genetic modifications which give rise to a frameshift also form part of 
the present invention, such as a change in the nucleotide sequence which leads to 
translation initiation from a different position or a mutation outside a coding region,
such as within an intron (if the RNA molecule is a primary transcript), or a 5' or 3'
untranslated region, which mutation may result in mis-translation and production of a 
mutant protein. 

It is preferred that the mutation is a nucleotide, and more preferably a
dinucleotide, deletion or insertion associated with the nucleotide sequence GAGA or its
complementary sequence CTCT of the RNA molecule; especially preferred frameshift
mutations are associated with the nucleotide sequence of the RNA comprising
GAGAX or CTCTX, where X is one of G, A, U or C, the preferred motifs being
GAGAG, GAGAC, GAGAT, and GAGAA as well as CTCTC, CTCTG, CTCTA and
CTCTT. As used herein, the term "GAGA mutation" may refer to either a single
nucleotide insertion or deletion or a dinucleotide insertion or deletion within the
GAGA or CTCT motif itself or adjacent to (5'- or 3'-terminal to-, and within 5-10
nucleotides of-) the GAGA or CTCT motif.
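
As a rough illustration of how the GAGA and CTCT motifs defined above might be located in a sequence, the following Python sketch scans for both motifs (including overlapping occurrences) and checks whether a given position lies within or near a motif. The function names find_motifs and near_motif, and the default window of 10 nucleotides, are assumptions made for the example only.

```python
import re

# Lookahead so that overlapping occurrences (e.g. in GAGAGA) are all found.
_MOTIF = re.compile(r"(?=(GAGA|CTCT))")


def find_motifs(seq):
    """Return (position, motif plus following base) for each GAGA/CTCT hit."""
    seq = seq.upper().replace("U", "T")
    return [(m.start(), seq[m.start():m.start() + 5]) for m in _MOTIF.finditer(seq)]


def near_motif(seq, pos, window=10):
    """True if `pos` lies inside a motif or within `window` bases 5' or 3' of it."""
    for start, _ in find_motifs(seq):
        if start - window <= pos <= start + 3 + window:
            return True
    return False


# Hypothetical toy sequence with one GAGA and one CTCT motif.
print(find_motifs("ATGGCTGAGAGCAAACTCTT"))
```

The fifth base returned with each hit corresponds to the X of GAGAX or CTCTX; a hit at the very end of a sequence is returned as the four-base motif alone.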

Preferably, the dinucleotide deletion is a GA deletion within the GAGA
motif or a GT deletion immediately following (i.e., within 10 nucleotides 3' of) a
GAGA motif or a CT deletion in the CTCT motif. It is further preferred that the
mutant RNA has one or two dinucleotide deletions associated with a GAGA, GAGAC,
GAGAG, GAGAT or GAGAA, or with a CTCT, CTCTG, CTCTC, CTCTA or
CTCTT, leading to a +1 or +2 frameshift mutation respectively.

In a preferred embodiment of the invention, the transcript mutations occur in 
RNA molecules of the neuronal system, where the disease is a neurodegenerative 
disease. 

The "neuronal system" is defined as any cells, RNA molecules, proteins or 
substances relating to or forming part of the neuronal system such as nerve cells, glial 
cells, proteins including Tau, β amyloid precursor protein, ubiquitin B, apolipoprotein
E4, neurofilament proteins and microtubule associated protein II, presenilin I,
presenilin II, Big Tau, glial fibrillary acidic protein (GFAP), Human P53 cellular
tumor antigen, human B-cell leukemia/lymphoma 2 (BCL-2) protooncogene,
semaphorins, human homolog of yeast up-frameshift protein 1 (HUPF-I), Human
Motility Group Protein (HMG), neuron specific protein A (NSP-A) and the RNA 
molecules encoding the proteins. 

Where the disease is a neurodegenerative disease, especially AD, the 
preferred mutant RNA molecules of the present invention are those encoding the β
amyloid precursor protein, the Tau protein, ubiquitin, apolipoprotein-E4 (Apo-E4),
microtubule associated protein II (MAP 2), the neurofilament proteins, presenilin I,
presenilin II, Big Tau, GFAP, P53, BCL2, HUPF-I, HMG and NSP-A, having a
deletion, insertion or other modification leading to a frameshift mutation. The most
preferred mutant RNA molecules of the present invention are those encoding β
amyloid precursor protein, ubiquitin B, MAP 2, the neurofilament proteins, presenilin 
I, presenilin II, Big Tau, GFAP, P53, bcl2 and HUPF-I, which have a frameshift 
mutation. 



It is preferred that the mutation is a GA or a GT dinucleotide deletion
associated with (within or within 10 nucleotides 5' or 3' of) a GAGA or GAGAX
sequence leading to a frameshift mutation or a CT or CA dinucleotide deletion
associated with (within or within 10 nucleotides 5' or 3' of) a CTCT or CTCTX
sequence leading to a frameshift mutation. It is further preferred that the mutant RNA
molecule has one or two GA or GT deletions or one or two CT or CA deletions, each
associated with a GAGA or CTCT sequence or similar motif, leading to a +1 or +2
frameshift mutation, respectively.

The term "mutant protein" as used herein is defined as the protein encoded by
the mutant RNA molecule of the present invention.

It is preferred that the methods of the present invention are for the diagnosis 
of a disease caused by or associated with at least one RNA molecule having one or 
more transcript mutations giving rise to a frameshift mutation. A preferred disease for
diagnosis by the present invention is AD, except the early onset AD cases found to be
linked to chromosomes 1, 14 and 21. It is further preferred that the methods of the
present invention are for the diagnosis of young and late onset AD, especially non- 
familial or "sporadic" late onset AD cases. 

As used herein, "biological sample" refers to a body fluid or body tissue
which contains proteins and/or cells from which nucleic acids and proteins can be
isolated. Preferred sources include buccal swabs, blood, sperm, epithelial or other 
tissue, milk, urine, cerebrospinal fluid, sputum, fecal matter, lung aspirates, throat 
swabs, genital swabs and exudates, rectal swabs, and nasopharyngeal aspirates. 

The body fluid sample can be any body fluid which contains cells having the
transcript mutation which gives rise to the frameshift mutation and causes or is
associated with the diseases. When the disease is a neurodegenerative disease it is
preferred that the body fluid sample contains cells of the neuronal system or the
products of such cells. When the disease is a neurodegenerative disease, the preferred
body fluid is cerebrospinal fluid, which can be obtained after a lumbar puncture
(Lannfelt et al., Nature Medicine, 1, pp 829-832, 1995). Another preferred body fluid
is blood (including, but not limited to, venous, arterial and cord blood), as it is easily 
obtained and contains lymphocytes which can be analyzed for the presence of the 
mutant RNA molecule or encoded protein. 

The tissue sample can be any tissue and is preferably one that can be easily
obtained, such as skin and nose epithelium.

Preferably, when analyzing the sample for a mutant RNA molecule, a nucleic
acid probe is used. The nucleic acid probe is preferably a nucleotide probe having a
sequence complementary to part of the mutant RNA molecule encompassing the
mutation giving rise to the frameshift mutation.

The probe must be used to detect RNA or DNA reverse transcribed from the
RNA, but must not be used to detect genomic DNA as the genomic DNA will not
contain the mutation.

The present invention further provides a nucleic acid probe having a 
sequence complementary to part of the mutant RNA molecule encompassing the 
mutation leading to the frameshift mutation. The probe is preferably sufficiently
complementary to the mutant sequence of the RNA molecule so that under stringent
conditions the probe only remains bound to the mutant sequence, and is able to
distinguish under stringent conditions the mutant and corresponding wild-type
transcripts. "Stringent" conditions are defined herein as RNA:DNA hybridization
conditions which may be performed at 65 °C using a hybridization buffer equivalent to
50% formamide and 0.1X SSC (see below and Evans et al. PNAS (1994) 9; 6059- 
6063, 6060). "Stringent" conditions also preferably include stringent washes, as 
described in Evans et al. (Ibid). 

The probe may be of any length but is preferably between 5 and 50 
nucleotides long, more preferably between 10 and 30 nucleotides long. For example,
the probe may be 5, 10, 15, 20, 25, or 30 nucleotides in length. 

In a preferred embodiment the probe comprises a sequence complementary to 
a GAGA or GAGAX or to a CTCT or CTCTX, having a nucleotide or dinucleotide 
deletion or insertion, and nucleotide sequences corresponding to the nucleotide 
sequences flanking the GAGA or CTCT motif in the wild-type RNA molecule. It
would be apparent to one skilled in the art that if reverse transcribed DNA 
complementary to the mutant RNA sequence was being probed for, a probe 
comprising a sequence complementary to the corresponding GAGA or CTCT motif 
present in the complementary DNA would have to be used. 
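
The notion of a probe complementary to the mutant sequence spanning the mutation can be illustrated with a short Python sketch. It is offered under stated assumptions only: the toy sequences, the junction index, the flank length and the names reverse_complement and design_probe are hypothetical, and exact string matching is used as a crude stand-in for hybridization under stringent conditions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a sequence, written in the DNA alphabet."""
    seq = seq.upper().replace("U", "T")
    return seq.translate(_COMPLEMENT)[::-1]


def design_probe(mutant, junction, flank=5):
    """DNA probe complementary to `flank` bases either side of `junction`.

    `junction` is the index in the mutant sequence of the first base
    3' of the deletion site.
    """
    mutant = mutant.upper().replace("U", "T")
    target = mutant[max(0, junction - flank):junction + flank]
    return reverse_complement(target)


# Hypothetical toy transcripts: wild type, and a GA deletion in its GAGA motif.
WILD_TYPE = "ATGGCTGAGAGCAAATAA"
MUTANT = "ATGGCTGAGCAAATAA"

probe = design_probe(MUTANT, junction=8)
target = reverse_complement(probe)
print(probe, target in MUTANT, target in WILD_TYPE)
```

Because the probe target spans the deletion junction, it occurs in the toy mutant sequence but not in the toy wild-type sequence. Probing reverse transcribed DNA instead of RNA would call for the complementary strand, as noted above.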

Methods of detecting the presence of the mutant RNA molecule include the
reverse transcriptase polymerase chain reaction (RT-PCR) using primers having a
sequence complementary to the sequence either side of the mutation which gives rise
to the frameshift mutation. Firstly, one primer is used to reverse transcribe the RNA
into DNA, and secondly, two primers are used to amplify the DNA, as described
hereinbelow.

The primers used in the above RT-PCR based method can vary in size from 
20bp to 2-3 kb; for example, 20bp, 50bp, 100bp, 500bp, 1000bp, 1500bp, 2000bp, or
3000bp. The primers can be prepared by a number of standard techniques including
cloning the sequences flanking the nucleotide region to be amplified or by
synthesizing the primers using the phosphoramidite method.

The present invention further provides primers for use in the above defined 
RT-PCR based methods for the amplification of the nucleotide region containing the 
mutation. 

Preferably, when analyzing the sample for the mutant protein of the present 
invention an immunological test is employed. The immunological test is preferably
based on the use of an antibody molecule having specificity for the mutant protein of the
present invention and not the wild-type protein. 

The present invention thus further provides an antibody molecule having 
specificity for the mutated protein of the present invention but not for the wild-type 
protein. Preferably, the antibody is specific for the carboxy terminal end of the mutant
protein.

The present invention further provides a method for the diagnosis of a 
neurodegenerative disease or other age-related diseases, or a method for the diagnosis 
of a person with a susceptibility for these diseases comprising: i. providing a body
fluid or tissue sample from a patient; and ii. analyzing the sample for the presence of
an RNA molecule of the neuronal system having a frameshift mutation or a protein
encoded thereby, wherein the presence of the mutated RNA molecule is indicative of a
neurodegenerative disease.

Preferably, the neurodegenerative disease is AD or Downs' syndrome.

The present invention also relates to methods for preventing and/or treating
the diseases, vectors for preventing and/or treating the diseases and for the production
of diagnostic reagents, compositions for preventing and/or treating the diseases,
nucleic acid sequences, probes and antibody molecules for use in the present invention
and transgenic animals.

Therapies contemplated according to the invention include providing to a cell
containing a mutant transcript a ribozyme which is capable of selectively eliminating
(i.e., cleaving) the mutant transcript, thus rendering the transcript untranslatable.

The therapies also may include providing to a cell which is thus treated with 
a ribozyme a corresponding wild-type transcript which is substantially uncleavable by
the ribozyme. The wild-type transcript may contain the wild-type sequence
corresponding to the mutant RNA sequence, except for the GAGA or CTCT
permutation, and encoding the wild-type protein, and also may include third base (in a
codon) silent mutations which further differentiate the wild-type RNA from the mutant
RNA sequence, and thus further distinguish the sequences with respect to ribozyme
recognition and cleavage.
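
The idea of third base silent mutations can be illustrated with a small Python sketch that rewrites each codon of a coding sequence to a synonymous codon differing only at the third base, where one exists. The function names and the toy sequence are hypothetical, the standard genetic code is assumed, and the choice of the first available synonymous codon is arbitrary; it does not reflect codon usage or any design rule stated in the present disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate every complete codon; stop codons are shown as '*'."""
    cds = cds.upper().replace("U", "T")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def silent_third_base_variant(cds):
    """Swap the third base of each codon for a synonymous alternative.

    Codons with no synonymous third-base alternative (ATG, TGG) are kept.
    """
    cds = cds.upper().replace("U", "T")
    out = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        replacement = codon
        for base in BASES:
            candidate = codon[:2] + base
            if candidate != codon and CODON_TABLE[candidate] == CODON_TABLE[codon]:
                replacement = candidate
                break
        out.append(replacement)
    return "".join(out)


# Hypothetical in-frame toy coding sequence containing a GAGA motif.
WILD_TYPE = "ATGGCTGAGAGCAAATAA"
variant = silent_third_base_variant(WILD_TYPE)
print(variant, translate(WILD_TYPE) == translate(variant))
```

In this toy case the variant encodes the same protein while no longer containing the GAGA motif, which shows how silent changes can alter the nucleotide sequence presented for recognition without altering the encoded protein.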

Therapies encompassed by the invention also include providing to cells 
containing a mutant RNA, an RNA or DNA that is complementary to the mutant RNA 
and able to form a duplex with the mutant RNA that is untranslatable in the cell. The 
complementary sequence may be the entire length of the mutant RNA, but is
preferably a shorter length, for example, 10, 20, 50 or 100 nucleotides in length. The
complementary sequence thus may be administered in the form of an oligonucleotide 
or may be encoded by an expressible sequence contained in a vector, wherein the 
vector is administered to the cell. 

The invention therefore encompasses a pharmaceutical composition
comprising a ribozyme that selectively cleaves a target RNA having a GAGA or
CTCT mutation admixed with a pharmaceutically acceptable carrier.

The invention also encompasses a pharmaceutical composition comprising a
ribozyme that selectively cleaves a target RNA having a GAGA or CTCT mutation
and a wild-type analog of an RNA having a GAGA or CTCT sequence giving rise to a
frameshift mutation admixed with a pharmaceutically acceptable carrier.

The invention also encompasses a pharmaceutical composition comprising a
wild-type analog of an RNA having a GAGA or CTCT sequence giving rise to a 
frameshift mutation admixed with a pharmaceutically acceptable carrier. 

The invention also encompasses a pharmaceutical composition wherein the 
wild-type analog of an RNA comprises a nucleotide sequence having third base silent
mutations. 

The invention also encompasses a pharmaceutical composition comprising a 
single stranded nucleic acid having a sequence that is complementary to an RNA 
having one or more GAGA or CTCT mutations giving rise to a frameshift mutation 
admixed with a pharmaceutically acceptable carrier.

The invention also encompasses a pharmaceutical composition comprising 
the wild-type analog of a mutant protein in admixture with a pharmaceutically 
acceptable carrier. 

The invention also encompasses a vector comprising an expressible gene 
encoding a ribozyme that selectively cleaves a target RNA having a GAGA or CTCT
sequence. 

The invention also encompasses a vector comprising an expressible gene 
encoding a sequence complementary to an RNA having a GAGA or CTCT mutation 
giving rise to a frameshift mutation. 

The invention also encompasses a host cell containing a vector as described
herein.

The invention also encompasses a method of treatment and/or prevention of a 
disease caused by or associated with an RNA having a GAGA or CTCT mutation 
giving rise to a frameshift mutation, comprising administering the compositions,
vectors, or the host cells described above to a patient suffering from or susceptible to
the disease.

The invention also encompasses the use of a vector encoding a ribozyme that 
selectively cleaves a target RNA having a GAGA or CTCT mutation under the control 
of a promoter in therapy. 

The invention also encompasses the use of a vector encoding a ribozyme
under the control of a promoter in the manufacture of a composition for the treatment
of a disease caused by or associated with at least one RNA having one or more
GAGA or CTCT mutations giving rise to a frameshift mutation.

The invention also encompasses the use of a vector encoding the sequence
complementary to an RNA having one or more GAGA or CTCT mutations giving rise
to a frameshift mutation under the control of a promoter in therapy. 

The invention also encompasses the use of more than one of the 
compositions, the vectors, or the host cells described above in any combination in 
therapy. 

The invention also encompasses the use of more than one of the
compositions, the vectors, or the host cells described herein in any combination in the
treatment and/or prevention of a disease caused by or associated with at least one
RNA having one or more GAGA or CTCT mutations giving rise to a frameshift
mutation. 

The present invention further provides an early marker for a
neurodegenerative disease. The invention provides a diagnostic kit for diagnosing a
disease caused by or associated with at least one RNA molecule having one or more
transcript mutations giving rise to a frameshift mutation comprising: i. a nucleic acid
probe having a sequence complementary to part of the mutant RNA molecule which
encompasses the mutation which leads to the frameshift mutation and packaging
materials therefor; and ii. means for detecting the probe bound to the mutant RNA 
molecule. 

The present invention further provides a diagnostic kit for diagnosing a 
disease caused by or associated with at least one RNA molecule having one or more
transcript mutations giving rise to a frameshift mutation comprising: i. primers for use
in an RT-PCR reaction, the primers having a sequence complementary to the sequence
either side of the mutation which gives rise to the frameshift mutation, packaging
materials therefor, and reagents necessary for performing an RT-PCR reaction and
amplifying the DNA sequence containing the mutation; and ii. means for detecting the
amplified DNA sequence containing the mutation.

The present invention further provides a diagnostic kit for diagnosing a
disease caused by or associated with at least one RNA molecule having one or more
transcript mutations giving rise to a frameshift mutation comprising: i. an antibody
molecule having specificity for the mutant protein of the present invention and not the
wild-type protein; and ii. means for detecting the antibody molecule bound to the
mutant protein. 

The antibody molecule and the means for detecting the bound antibody 
molecule are as defined above. 

In a further embodiment of the present invention the diagnostic kit described
above additionally comprises: i. an antibody molecule having specificity for the
wild-type protein; and ii. means for detecting the antibody molecule bound to the wild-type
protein, as a control for diagnosing a disease caused by or associated with at least one
RNA molecule having one or more transcript mutations giving rise to a frameshift
mutation.

The present invention further provides an RNA molecule having one or more
transcript mutations giving rise to a frameshift mutation which causes or is associated
with a disease. 

The invention further provides several (one or more) RNA molecules 
encoding the same amino acid sequence up to the GAGA or CTCT motif, and
thereafter encoding different sequences. For example, in a single cell, one RNA
molecule encoding a frameshifted protein based on, for example, β-APP, may contain a
mutation at or within the GAGA motif in exon 9 of the RNA sequence and a second
RNA molecule encoding β-APP may contain a mutation at or within the GAGA motif
in exon 10 of the sequence.

The present invention further provides a mutated protein encoded by the
mutated RNA molecule found to be indicative of a disease, the mutant RNA molecule
having one or more transcript mutations giving rise to a frameshift mutation.
Preferably, the mutant protein contains an antigenic epitope specific for the diseased
state, examples of which are provided in Table 9.

In a preferred embodiment of the present invention the mutated RNA
molecule encodes a protein comprising at least part of the sequence designated +1 or
+2 in any one of Figures 2 to 19, or an immunologically equivalent fragment thereof.

In a preferred embodiment the mutated protein comprises any one of the
following individual sequences: RGRTSSKELA [SEQ ID NO: 1]; HGRLAPARHAS
[SEQ ID NO: 2]; YADLREDPDRQ [SEQ ID NO: 3]; RQDHHPGSGAQ [SEQ ID
NO: 4]; YADLREDPDRQDHHPGSGAQ [SEQ ID NO: 1400]; GGGAQ [SEQ ID
NO: 5]; GAPRLPPAQAA [SEQ ID NO: 6]; KTRFQRKGPS [SEQ ID NO: 7];
PGNRSMGHE [SEQ ID NO: 8]; EAEGGSRS [SEQ ID NO: 9]; VGAARDSRAA
[SEQ ID NO: 10]; HDYPPGGSV [SEQ ID NO: 11]; SIQKFQV [SEQ ID NO: 12];
VEKPGERGGR [SEQ ID NO: 13]; PLFGRGHKRG [SEQ ID NO: 14];
EDRGDAGWRGH [SEQ ID NO: 15]; QERGASPRAAPREH [SEQ ID NO: 16];
RQPGDVAPGGQHRPVDD [SEQ ID NO: 17]; AGLLAIPEAK [SEQ ID NO: 18];
YVDVYNGGKFS [SEQ ID NO: 19]; AADERRCHLLHMCGRR [SEQ ID NO: 20];
QQATEAGQHYQPGSPLHDHSHV [SEQ ID NO: 21]; PQEAAARTNR [SEQ ID
NO: 22]; RSWVHPAPPYQMCLG [SEQ ID NO: 23]; and GGSRTHPR [SEQ ID
NO: 24], especially when the disease is a neurodegenerative disease such as AD.

In a preferred embodiment, the antibody molecule of the present invention
has affinity for the mutant proteins defined above.

The present invention also relates to a method for treating and/or preventing
a disease caused by or associated with at least one RNA molecule having one or more
transcript mutations giving rise to a frameshift mutation. The finding of mutations in
RNA molecules which lead to the production of mutant proteins, and which are
indicative of a disease, has led to a number of ways of treating and/or preventing the
disease.

The present invention further provides a method for identifying diseases
caused by or associated with at least one RNA molecule having one or more transcript
mutations giving rise to a frameshift mutation. The method comprises: i. providing
the sequence of an RNA molecule suspected of being involved in the pathogenesis of a
disease; ii. identifying the sequence of the mutant protein encoded by the RNA
sequence 3'-terminal to a frameshift mutation; iii. preparing a probe to the mutant
protein or a fragment thereof; and iv. probing a body fluid or tissue sample from a
patient having the disease and a patient not having the disease, in order to find a
correlation between the presence of the mutant protein and the diseased state.

Preferably, the probe is an antibody molecule as defined herein. It is further 
preferred that the antibody molecule has affinity for a protein comprising at least one 
of the sequences: RGRTSSKELA [SEQ ID NO: 1]; HGRLAPARHAS [SEQ ID NO:
2]; YADLREDPDRQ [SEQ ID NO: 3]; RQDHHPGSGAQ [SEQ ID NO: 4];
YADLREDPDRQDHHPGSGAQ [SEQ ID NO: 1400]; GGGAQ [SEQ ID NO: 5];
GAPRLPPAQAA [SEQ ID NO: 6]; KTRFQRKGPS [SEQ ID NO: 7]; 
PGNRSMGHE [SEQ ID NO: 8]; EAEGGSRS [SEQ ID NO: 9]; VGAARDSRAA 
[SEQ ID NO: 10]; HDYPPGGSV [SEQ ID NO: 11]; SIQKFQV [SEQ ID NO: 12]; 
VEKPGERGGR [SEQ ID NO: 13]; PLFGRGHKRG [SEQ ID NO: 14]; 
EDRGDAGWRGH [SEQ ID NO: 15]; QERGASPRAAPREH [SEQ ID NO: 16]; 
RQPGDVAPGGQHRPVDD [SEQ ID NO: 17]; AGLLAIPEAK [SEQ ID NO: 18]; 
YVDVYNGGKFS [SEQ ID NO: 19]; AADERRCHLLHMCGRR [SEQ ID NO: 20];
QQATEAGQHYQPGSPLHDHSHV [SEQ ID NO: 21]; PQEAAARTNR [SEQ ID 
NO: 22]; RSWVHPAPPYQMCLG [SEQ ID NO: 23]; and GGSRTHPR [SEQ ID 
NO: 24], especially when the disease is a neurodegenerative disease such as AD. 

Other features and advantages of the invention will be apparent from the 
following description of the preferred embodiments thereof, and from the claims. 

BRIEF DESCRIPTION OF DRAWINGS 
The invention is now illustrated in the appended example with reference to 
the following drawings: 

Figure 1 is a copy of a paraffin section of the frontal cortex of a female
Alzheimer patient (70 years old, #83002; Table 2) immunocytochemically stained
with an antibody against a peptide predicted by the +1 reading frame of β-APP (Figure
20). Dystrophic neurites (arrowheads) and tangles (arrows) are clearly visible in
cortical layer III.

Figure 2 presents the coding nucleotide sequence of the human β amyloid
precursor protein gene transcript [SEQ ID NO: 25], the amino acid sequence of the
wild-type protein [SEQ ID NO: 83 and 84], the mutant +1 frameshift protein [SEQ
ID NO: 50-82] and the mutant +2 frameshift protein [SEQ ID NO: 26-49].

Figure 3 presents the coding nucleotide sequence of the human
microtubule-associated protein tau gene transcript [SEQ ID NO: 85], the amino acid
sequence of the wild-type protein [SEQ ID NO: 99], the mutant +1 frameshift protein
[SEQ ID NO: 86-98] and the mutant +2 frameshift protein [SEQ ID NO: 100-112].

Figure 4 presents the coding nucleotide sequence of the human ubiquitin B
gene transcript [SEQ ID NO: 113], the amino acid sequence of the wild-type protein
[SEQ ID NO: 125 and 126], the mutant +1 frameshift protein [SEQ ID NO: 114-124]
and the mutant +2 frameshift protein [SEQ ID NO: 127-136].

Figure 5 presents the coding nucleotide sequence of the human
apolipoprotein E gene transcript [SEQ ID NO: 137], the amino acid sequence of the
wild-type protein [SEQ ID NO: 144-146], the mutant +1 frameshift protein [SEQ ID
NO: 138-143] and the mutant +2 frameshift protein [SEQ ID NO: 147-152].
Information concerning restriction enzyme sites is also given.

Figure 6 presents the coding nucleotide sequence of the human
microtubule-associated protein 2 transcript [SEQ ID NO: 153], the amino acid sequence
of the wild-type protein [SEQ ID NO: 154-158], the mutant +1 frameshift protein [SEQ
ID NO: 232-347] and the mutant +2 frameshift protein [SEQ ID NO: 159-231].

Figure 7 presents the coding nucleotide sequence of the human neurofilament
subunit NF-low transcript [SEQ ID NO: 348], the amino acid sequence of the
wild-type protein [SEQ ID NO: 466-513], the mutant +1 frameshift protein [SEQ ID NO:
413-465] and the mutant +2 frameshift protein [SEQ ID NO: 349-412].

Figure 8 presents the coding nucleotide sequence of the human neurofilament
subunit NF-M transcript [SEQ ID NO: 514], the amino acid sequence of the wild-type
protein [SEQ ID NO: 515-574], the mutant +1 frameshift protein [SEQ ID NO: 629-695]
and the mutant +2 frameshift protein [SEQ ID NO: 575-628].

Figure 9 presents the coding nucleotide sequence of the human neurofilament
subunit NF-H gene transcript [SEQ ID NO: 696], the amino acid sequence of the
wild-type protein [SEQ ID NO: 697-698], the mutant +1 frameshift protein [SEQ ID
NO: 708-710] and the mutant +2 frameshift protein [SEQ ID NO: 699-707].

Figure 10 presents the coding mRNA nucleotide sequence [SEQ ID NO:
711] and amino acid sequence of presenilin I expressed in the wildtype [SEQ ID NO:
712], +1 [SEQ ID NO: 733-753], and +2 [SEQ ID NO: 713-732] reading frames.

Figure 11 presents the coding mRNA nucleotide sequence [SEQ ID NO: 754] and
amino acid sequence of presenilin II expressed in the wildtype [SEQ ID NO: 776-787],
+1 [SEQ ID NO: 755-775], and +2 [SEQ ID NO: 788-814] reading frames.

Figure 12 presents the coding mRNA nucleotide sequence [SEQ ID NO:
815] and amino acid sequence of Big Tau expressed in the wildtype [SEQ ID NO:
816-818], +1 [SEQ ID NO: 824-834] and +2 [SEQ ID NO: 819-823] reading frames.

Figure 13 presents the coding mRNA nucleotide sequence [SEQ ID NO:
835] and amino acid sequence of GFAP expressed in the wildtype [SEQ ID NO: 836-852],
+1 [SEQ ID NO: 883-914], and +2 [SEQ ID NO: 853-882] reading frames.

Figure 14 presents the coding mRNA nucleotide sequence [SEQ ID NO:
915] and amino acid sequence of P53 expressed in the wildtype [SEQ ID NO: 940-949],
+1 [SEQ ID NO: 916-939] and +2 [SEQ ID NO: 950-965] reading frames.

Figure 15 presents the coding mRNA nucleotide sequence [SEQ ID NO:
966] and amino acid sequence of BCL2 expressed in the wildtype [SEQ ID NO: 967-1015],
+1 [SEQ ID NO: 1075-1126] and +2 [SEQ ID NO: 1016-1074] reading
frames.

Figure 16 presents the coding mRNA nucleotide sequence [SEQ ID NO:
1127] and amino acid sequence of Semaphorin III expressed in the wildtype [SEQ ID
NO: 1128-1131], +1 [SEQ ID NO: 1162-1212] and +2 [SEQ ID NO: 1132-1161]
reading frames.

Figure 17 presents the coding mRNA nucleotide sequence [SEQ ID NO:
1213] and amino acid sequence of HUPF expressed in the wildtype [SEQ ID NO:
1241-1244], +1 [SEQ ID NO: 1214-1240] and +2 [SEQ ID NO: 1245-1281] reading
frames.

Figure 18 presents the coding mRNA nucleotide sequence [SEQ ID NO:
1282] and amino acid sequence of HMG expressed in the wildtype [SEQ ID NO:
1297-1299], +1 [SEQ ID NO: 1289-1296] and +2 [SEQ ID NO: 1283-1288] reading
frames.

Figure 19 presents the coding mRNA nucleotide sequence [SEQ ID NO:
1300] and amino acid sequence of NSP-A expressed in the wildtype [SEQ ID NO:
1374-1387], +1 [SEQ ID NO: 1339-1373], and +2 reading frames [SEQ ID NO:
1301-1338].

Figure 20 presents the partial mRNA nucleotide sequence and amino acid
sequence of two human neuronal proteins (β amyloid precursor protein (exons 9 and
10) and Ubiquitin B (exon 2)) expressed in the wildtype and +1 reading frame.

Figure 21: Two examples of novel restriction sites generated by dinucleotide
deletion in transcripts of β amyloid precursor protein and ubiquitin B (wild-type
nucleotide sequences [SEQ ID NO: 25 and 113]; β amyloid precursor protein deletion
sequences [SEQ ID NO: 1396 and 1397]; ubiquitin deletion sequences [SEQ ID NO:
1398-1399]).

DESCRIPTION 

The invention is illustrated by the following nonlimiting examples wherein
the following materials and methods are employed. The entire disclosure of each of
the literature references cited hereinafter is incorporated by reference herein.

The present invention is based on the discovery that frameshift mutations
occur in a single RNA molecule or number of RNA molecules whose product or
products are mutant proteins that are associated with, and indicative of, a disease state.
The invention is based on the recognition that the presence of a frameshift mutation
results in a new coding sequence for the cell containing the frameshift mutation, and
thus a new polypeptide (herein termed a mutant protein) which may be correlated with
and thus be indicative of a disease.

According to the present invention, diagnosis and/or identification of a
disease caused by or associated with at least one RNA molecule having one or more
transcript mutations which give rise to a frameshift mutation is accomplished as
described herein.

According to the present invention, methods for preventing and/or treating
the diseases, vectors for preventing and/or treating the diseases and for the production
of diagnostic reagents, compositions for preventing and/or treating the diseases,
nucleic acid sequences, probes and antibody molecules for use in the present invention
and transgenic animals are accomplished as described herein.

According to the present invention, methods for detecting errors in
transcriptional mechanisms are accomplished as described herein. The correction of
the mutations found in the mutant RNA molecules of the present invention is therefore
a valuable method for combatting diseases.

Methods and reagents for disease diagnosis and treatment are described in 
more detail hereinbelow. 

Diagnosis of Diseases According to the Invention 

The invention relates to methods for diagnosing diseases caused by or
associated with at least one RNA molecule having one or more transcript mutations
which give rise to a frameshift mutation. Such diseases include but are not limited to
cancers, Diabetes mellitus type II and neurodegenerative diseases such as Parkinson's
Disease (PD), Alzheimer's Disease (AD), frontal lobe dementia (Pick's Disease),
progressive supranuclear palsy (PSP) and other diseases with abundant tau-positive
filamentous lesions such as Corticobasal degeneration, Dementia pugilistica, Dementia
with tangles only, Dementia with tangles and calcification, Frontotemporal dementias
with Parkinsonism linked to chromosome 17, Gerstmann-Sträussler-Scheinker disease
with tangles, Myotonic dystrophy, Niemann-Pick disease type C,
Parkinsonism-dementia complex of Guam, Postencephalitic Parkinsonism, Subacute sclerosing
panencephalitis, amyotrophic lateral sclerosis, Huntington's Disease, dementia with
Lewy bodies, multisystem atrophy, other inclusion body diseases associated with
ubiquitin such as Alexander's disease, Alcoholic liver disease, lichen amyloidosis, and
during aging Marinesco bodies and Hyaline inclusions, multiple sclerosis, Down's
syndrome, and other degenerative diseases such as cardiovascular diseases, rheumatoid
arthritis, and Diabetes mellitus type II.

Somatic mutations can result in a different gene function and have been
implicated in diseases associated with aging, such as certain cancers. However, it has
generally been assumed that non-proliferating cells do not undergo important changes
at the genomic level. For example, it was assumed previously that genomic changes
are mainly related to cell proliferation (Smith, Mutation Research, 277, pp 139-142,
1992) which for non-proliferating cells such as most neurons ends during early
postnatal life (Rakic, Science, 227, pp 1054-1056, 1985). However, Evans et al.,
1994, Proc. Natl. Acad. Sci. 91:6059, suggested that somatic mutations do occur in
genes of the neuronal system, i.e., in post-mitotic neurons. The di/di Brattleboro rat,
which suffers from severe diabetes insipidus due to the absence of the antidiuretic
hormone vasopressin (VP), was the subject of the Evans et al. paper. It had previously
been established that the VP hormone was absent in the Brattleboro rat due to a
deletion of a single G residue in the second exon of the VP gene, resulting in a mutant
VP precursor with an altered C-terminal amino acid sequence. It had also been
observed that a small number of neurons in the di/di rat exhibited a heterozygous +/di
phenotype and expressed an apparently normal VP gene product. In studying the
molecular biology of the di/di rat, Evans et al. identified sequence alterations that
restored the reading frame of the mutant VP precursor mRNA, which were based on a
di-nucleotide deletion in a GAGAG motif. They correlated the presence of small
amounts of normal VP gene product in single magnocellular neurons with a reversion
of the mutant gene stemming from a frameshift mutation. Evans et al. concluded that,
because +1 frameshift mutations are present in VP transcripts of both wild-type rats
and di/di rats, the events leading to these mutations are not caused by the diseased
state of the di/di rat per se. Thus, Evans et al. did not correlate a mutational GAGAG
hotspot with a disease state, or a predilection to a disease. Furthermore, there is no
suggestion in the prior art that transcript mutations are occurring and that such
transcript mutations are caused by or associated with a disease. As the mutations have
previously been considered to occur in DNA, methods of detection have been
unreliable as there will be no mutation in the genomic DNA and the probing of
genomic DNA will give a false indication of the absence of the mutation.

In the present invention, the observations of Evans et al., as to reversions in
the wild-type reading frame at GAGAG hotspots in VP transcripts within single
neurons of the di/di rat leading to wild-type-like VP gene products, are extended and
developed. According to the present invention, a human disease which is caused by or
associated with at least one RNA molecule having one or more transcript mutations
occurring at a mutational hotspot and which give rise to a frameshift mutation is
identified and/or diagnosed. The nucleotide sequence of an RNA molecule suspected
of being involved in the pathogenesis of a disease is provided, e.g., from published 
gene sequences or from cloning and sequencing of a suspect RNA molecule. The 
amino acid sequence encoded by the RNA molecule is then predicted, as are amino 
acid sequences of encoded mutant proteins. Mutant protein sequences are predicted in
+1 and +2 reading frames following a hypothesized frameshift mutation. The location
of the frameshift mutation may be hypothesized with respect to certain nucleotide 
sequence motifs which are suspected of causing frameshift mutations, examples of 
such motifs present in the RNA molecule including but not limited to those 
comprising GAGA, for example, GAGAC, GAGAG, GAGAT, and GAGAA, or those 
comprising CTCT, for example CTCTC, CTCTG, CTCTA and CTCTT. 
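
The motif search described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used by the inventors: the function name `find_hotspot_motifs` and the toy input sequence are hypothetical, and the sketch assumes that locating the four-base cores GAGA and CTCT is sufficient, since every five-base example listed (GAGAC, GAGAG, CTCTC and so on) contains one of them.

```python
import re

# Four-base cores named in the text; each five-base example
# (GAGAC, GAGAG, CTCTC, ...) contains one of these cores.
MOTIF_CORES = ("GAGA", "CTCT")


def find_hotspot_motifs(sequence):
    """Return (0-based position, core) for every GAGA or CTCT occurrence.

    Accepts RNA (U) or cDNA (T) input. A zero-width lookahead is used so
    that overlapping occurrences, e.g. in GAGAGA, are all reported.
    """
    seq = sequence.upper().replace("U", "T")
    pattern = re.compile("(?=(" + "|".join(MOTIF_CORES) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    toy = "AUGGCCGAGAGUUCUCUAA"  # hypothetical sequence, not taken from the Figures
    for position, core in find_hotspot_motifs(toy):
        print(position, core)
```

Each reported position is a candidate site at which a frameshift may be hypothesized; whether a given site is in fact mutated in a patient sample has to be established experimentally.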

A probe is then prepared that is specific for the mutant protein or an 
immunogenic fragment thereof (such probes being described hereinabove for detection 
of proteins or protein fragments). Depending on where the mutation that leads to the
frameshift occurs, part of the mutant protein will have the same sequence as the wild- 
type protein and part of the protein will have the sequence of the mutant protein. 
Furthermore, depending on where the mutation occurs the mutant protein will 
terminate when the nucleotide sequence codes for a stop codon (indicated as * in the 
Figures). Thus, different mutant proteins will be produced depending on where the 
mutation occurs. 
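
The way a deletion changes the downstream protein sequence can be illustrated with the following sketch. It assumes the standard genetic code, works on the cDNA (T rather than U), and models the mutation as a simple deletion of two bases at a chosen position; the helper names and the toy sequence are hypothetical and are not taken from the Figures.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cdna):
    """Translate from the first base until the first stop codon (or the end)."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        amino_acid = CODON_TABLE[cdna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_bases(cdna, position, count=2):
    """Remove `count` bases starting at 0-based `position` (a dinucleotide by default)."""
    return cdna[:position] + cdna[position + count:]


if __name__ == "__main__":
    wild_type = "ATGGCCGAGAGTTCTAAGTGA"    # hypothetical toy transcript with a GAGAG motif
    mutant = delete_bases(wild_type, 6)    # delete "GA" within the motif
    print(translate(wild_type))            # wild-type protein
    print(translate(mutant))               # shared N-terminus, then a new C-terminus
```

In this toy case both proteins share the residues encoded upstream of the deletion and then diverge, and the mutant terminates at the first stop codon met in the new reading frame, which is the behaviour described in the paragraph above.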

Alzheimer's Disease (AD) is a representative disease diagnosable and
treatable according to the invention. AD is a neurodegenerative disease characterised
by idiopathic progressive dementia and is the fourth highest major cause of death in
developed countries. It affects 5 to 11% of the population over the age of 65 and as
much as 47% of the population over the age of 85. At present there are an estimated 4
million patients suffering from AD in the U.S.A. (see Coleman, Neurobiol. of Aging,
15, Suppl. 2, pp 577-578, 1994), and an estimated 20 million Alzheimer's patients
worldwide.

The clinical criteria for AD diagnosis have been defined (see Reisberg et al.,
Am. J. Psych. 139, pp 1136-1139, 1982; McKhann et al., Neurology, 34, pp 939-944,
1984). The early symptoms of AD vary but generally include depression, paranoia
and anxiety. There is also a slow degeneration of intellectual function and memory.
In particular, cognitive dysfunction and specific disturbances of speech (aphasia),
motor activity (apraxia), and recognition of perception (agnosia) can occur.

There is not yet general consensus on a test for ante mortem diagnosis of AD
due to the lack of knowledge of the pathogenic mechanisms involved in AD.
Diagnosis of AD is made by examination of brain tissue. Such diagnosis is usually
carried out on individuals post mortem. The diagnosis is based on the presence of a
large number of intraneuronal neurofibrillary tangles and of neuritic plaques in the
brain tissue, in particular in the neocortex and hippocampus. In order to identify the
various types of plaques (e.g. neuritic plaques), neuropil threads and neurofibrillary
tangles, staining and microscopic examination of several brain tissue sections is
necessary. Neuritic plaques are believed to be composed of degenerating axons (e.g.,
neuropil threads), nerve terminals and possibly astrocytic and microglial elements. It
is also often found that neuritic plaques have an amyloid protein core. The
neurofibrillary tangles comprise normal and paired helical filaments and are believed
to consist of several proteins.

There are two major types of AD, late onset (>65 years) and early onset (<65
years). Approximately 85% of all AD cases are late onset and only 15% are early
onset. Of the latter group 0.3% consists of the hereditary type of AD linked to
chromosome 21, 2% of the cases are considered to be linked to chromosome 14, and
a link to chromosome 1 has been established for juvenile onset (<0.1%), as discussed
below. Sporadic cases are the most prominent group (40%) in early onset AD.

In the most common late onset group, 40% of cases are considered to be
familial, meaning that Alzheimer's disease was observed in first degree relatives. Of
this familial form only 10% is autosomal dominant. The remaining late onset cases (60%)
are non-familial or "sporadic" cases (see Table 1). For these cases relatively little is
known and previously no data was available which suggested a possible cause of AD.

At present, it is unclear whether the formation of neuritic plaques and/or 
neurofibrillary tangles is directly responsible for causing AD. The formation of 
neuritic plaques, neuropil threads and/or neurofibrillary tangles may be a consequence 
of a more fundamental cellular or biochemical change. 

Diagnostic methods of the invention will include the detection of nucleic acid 
sequences, preferably via procedures which involve formation of a nucleic acid duplex 
between two nucleic acid strands, i.e., a nucleic acid probe and a complementary 
sequence in the mutant RNA or the DNA reverse transcribed from the mutant RNA 
isolated from a biological sample, or detection of a protein, preferably a mutant or 
hybrid wild-type/nonsense protein, as defined herein. 

1. Preparation and Detection of RNA for Genetic Screening. 

Typically, RNA is prepared from the biological sample by DNA extraction 
procedures well-known in the art (see, e.g., Sambrook et al., 1990, A Laboratory
Manual for Cloning, Cold Spring Harbor Press, CSH, NY), and may be further
purified if desired, e.g., by electro-elution, prior to analysis.

Methods of detecting a mutant RNA molecule from a biological sample 
include, but are not limited to the following: (1) reverse transcriptase polymerase chain 
reaction (RT-PCR) followed by sizing gel electrophoresis or hybridization with an 
allele-specific (or sequence-specific) probe; (2) hybridization of the eluted RNA with a 
nucleic acid probe that is complementary to the mutated RNA; (3) the ARMS test, in 
which one primer has a complementary sequence encompassing the mutation which 
gives rise to the frameshift mutation, and amplification only occurs if the mutated 
sequence is present; (4) nucleotide sequencing; (5) RNA amplification via RT-PCR
and T7 polymerase; and (6) restriction analysis of the cDNA obtained by RT-PCR,
since a dinucleotide deletion in the RNA can generate novel restriction sites in the
cDNA (Figure 21).
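
Method (6) relies on a deletion creating a recognition site that is absent from the wild-type cDNA. The comparison can be sketched as follows. This is an illustration under stated assumptions: the function name `novel_sites` and the toy sequences are hypothetical and are not the sequences of Figure 21, only four common palindromic recognition sequences are included, and only the given strand is scanned (which is sufficient for palindromic sites).

```python
# Recognition sequences of four widely used restriction enzymes (all palindromic).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
}


def novel_sites(wild_type, mutant, sites=RECOGNITION_SITES):
    """Names of enzymes whose site occurs in the mutant cDNA but not in the wild type."""
    wild_type, mutant = wild_type.upper(), mutant.upper()
    return sorted(
        name for name, site in sites.items()
        if site in mutant and site not in wild_type
    )


if __name__ == "__main__":
    wild_type = "ATGTCGAGACTT"   # hypothetical cDNA containing a GAGA motif
    mutant = "ATGTCGACTT"        # the same cDNA after deletion of one "GA"
    print(novel_sites(wild_type, mutant))
```

An enzyme reported by such a comparison would cut only the amplified product derived from the mutant transcript, so that digestion followed by gel electrophoresis could distinguish mutant from wild-type product.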

A nucleic acid probe useful according to the invention is preferably 
sufficiently complementary to the mutant sequence of the RNA molecule so that under 
stringent conditions the probe only remains bound to the mutant sequence (see Evans
et al., Proc. Natl. Acad. Sci. USA, 91:6059-6063 (1994)). The probe is preferably
labelled using any of the standard techniques known to those skilled in the art, such as
radioactively using 32P or any other standard isotopes, or using non-radioactive
methods including biotin or DIG labelling. The labelled probe can then be easily
detected by methods well known to those skilled in the art. 

An alternative method for detecting the presence of the mutant RNA 
molecule is via the reverse transcriptase polymerase chain reaction (RT-PCR). 
Primers having a sequence complementary to the sequence either side of the mutation 
which gives rise to the frameshift mutation are used to reverse transcribe the RNA and 
amplify the reverse transcribed DNA containing the mutation. The mutation in the 
amplified fragment can then be detected using the probe described above using 
standard techniques or by sequencing the amplified fragment. The advantages of
using the RT-PCR reaction are that less starting material is required and that the PCR
methods allow quantitative as well as qualitative determinations to be made.
Quantitative determinations allow the number of copies of a mutated RNA molecule 
present in a particular sample to be estimated, and given this information the severity 
of the diseased state can be estimated. 

Another alternative method for detecting the presence of the mutant RNA 
molecule is one in which one primer has a complementary sequence encompassing the 
mutation which gives rise to the frameshift mutation. Amplification will therefore
only occur if the mutated sequence is present (Newton et al., Nucl. Acids Res.
17:2503, 1989). The method has previously been used in detecting mutations in the
gene responsible for cystic fibrosis, and one skilled in the art could easily perform this 
test for the detection of the mutant RNA or the reverse transcribed DNA 
corresponding to the mutant RNA of the present invention. 

An example of analysis method (1) follows. The RNA is reverse transcribed
and the DNA then amplified, e.g., using PCR, prior to analysis. Specific conditions
for any one PCR, i.e. a PCR targeting a particular sequence, or for any one multiplex 
PCR, i.e. a PCR targeting a particular set of sequences, may vary but will be known to 
a person of ordinary skill in the art. 

Amplification of a mutated or wild-type reverse transcribed DNA sequence 
can be accomplished directly from an aliquot of the prepared DNA as follows. 

25 µl of DNA is aliquotted into a reaction tube containing 25 µl H2O, 50 µl
master mix (see below), 0.5 µl Amplitaq (Perkin Elmer Cetus, Norwalk, CT) and 0.5
µl UNG (Perkin Elmer Cetus, Norwalk, CT). A 50 µl master mix comprises 20 mM
Tris HCl, pH 8.3, 100 mM KCl, 5 mM MgCl2, 0.02 µmoles each of dATP, dGTP,
dCTP, 0.04 µmoles of dUTP, 20 pmoles of each primer (Perkin Elmer Cetus,
Norwalk, CT), and 25 µg gelatin.
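
As a worked check on the recipe above, the final concentrations in the assembled reaction can be computed from the listed volumes. The sketch below assumes the recipe reads as 25 µl DNA, 25 µl water, 50 µl master mix, 0.5 µl Amplitaq and 0.5 µl UNG, that the millimolar values are concentrations within the 50 µl master mix, and that the nucleotide amounts are totals per reaction; these readings, and all names in the code, are assumptions rather than statements from the text.

```python
# Component volumes in microlitres (assumed reading of the recipe).
VOLUMES_UL = {"DNA": 25.0, "water": 25.0, "master_mix": 50.0, "Amplitaq": 0.5, "UNG": 0.5}

# Concentrations within the master mix, in mM.
MASTER_MIX_MM = {"Tris-HCl": 20.0, "KCl": 100.0, "MgCl2": 5.0}

# Nucleotide amounts per reaction, in micromoles.
NUCLEOTIDES_UMOL = {"dATP": 0.02, "dGTP": 0.02, "dCTP": 0.02, "dUTP": 0.04}


def total_volume_ul():
    """Total reaction volume in microlitres."""
    return sum(VOLUMES_UL.values())


def final_mm(master_mix_mm):
    """Dilute a master-mix concentration (mM) into the full reaction volume."""
    return master_mix_mm * VOLUMES_UL["master_mix"] / total_volume_ul()


def final_um(amount_umol):
    """Convert an amount in micromoles to a final concentration in micromolar.

    micromoles per microlitre equals mol/L, so multiply by 1e6 to obtain µM.
    """
    return amount_umol / total_volume_ul() * 1e6


if __name__ == "__main__":
    print(f"total volume: {total_volume_ul():.1f} µl")
    for name, conc in MASTER_MIX_MM.items():
        print(f"{name}: {final_mm(conc):.2f} mM")
    for name, amount in NUCLEOTIDES_UMOL.items():
        print(f"{name}: {final_um(amount):.0f} µM")
```

Under these assumptions the master mix is diluted roughly two-fold, giving about 10 mM Tris HCl, 50 mM KCl and 2.5 mM MgCl2, with each of dATP, dGTP and dCTP at about 200 µM and dUTP at about 400 µM.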

A fragment characteristic of the selected amplification sequence can then be
visualized under ultraviolet light after ethidium bromide staining of a 13%
polyacrylamide gel in which an aliquot of the amplification reaction has been
electrophoresed. Alternatively, hybridization with allele-specific probes can identify
the presence of amplified product from the normal and/or mutant alleles.

2. Preparation and Detection of Protein for Genetic Screening. 

Where the biological molecule to be analyzed is a protein, it may be desirable 
to release the nucleic acid from biological sample cells prior to protein elution, or to 
remove nucleic acid from the sample eluate prior to protein analysis. Thus, the sample 
or eluate may first be treated to release or remove the nucleic acid by mechanical 
disruption (such as freeze/thaw, abrasion, sonication), physical/chemical disruption, 
such as treatment with detergents (e.g., Triton, Tween, or sodium dodecylsulfate), 
osmotic shock, heat, enzymatic lysis (lysozyme, proteinase K, pepsin, etc.), or 
nuclease treatment, all according to conventional methods well known in the art. 

Where a biological sample includes a mutant protein, the presence or absence 
of which is indicative of a genetic disease, the protein may be detected using
conventional detection assays, e.g., using protein-specific probes such as an antibody
probe. Similarly, where a genetic disease correlates with the presence or absence of an
amino acid or sequence of amino acids, these amino acids may be detected using
conventional means, e.g., an antibody which is specific for the native or mutant
sequence (see Table 9 for examples of amino acid sequences present in mutant
proteins).

Any of the antibody reagents useful in the method of the present invention
may comprise whole antibodies, antibody fragments, polyfunctional antibody
aggregates, or in general any substance comprising one or more specific binding sites
from an antibody. The antibody fragments may be fragments such as Fv, Fab and
F(ab')2 fragments or any derivatives thereof, such as single chain Fv fragments. The
antibodies or antibody fragments may be non-recombinant, recombinant or
humanized. The antibody may be of any immunoglobulin isotype, e.g., IgG, IgM, and
so forth. In addition, aggregates, polymers, derivatives and conjugates of
immunoglobulins or their fragments can be used where appropriate.

The immunoglobulin source for an antibody reagent can be obtained in any
manner such as by preparation of a conventional polyclonal antiserum or by
preparation of a monoclonal or a chimeric antibody. Antiserum can be obtained by
well-established techniques involving immunization of an animal, such as a mouse,
rabbit, guinea pig or goat, with an appropriate immunogen.

Preparation of Antibodies 

1. Polyclonal antibodies. 

The peptide or polypeptide may be conjugated to a conventional carrier (e.g.
thyroglobulin) in order to increase its immunogenicity, and antisera to the
peptide-carrier conjugate are raised in rabbits. Coupling of a peptide to a carrier protein and
immunizations are performed as described (Dymecki, S.M., et al., J. Biol. Chem.
267:4815-4823, 1992). Rabbit antibodies against this peptide are raised and the sera
titered against peptide antigen by ELISA or alternatively by dot or spot blotting
(Boersma and Van Leeuwen, 1994, Jour. Neurosci. Methods 51:317). At the same
time, the antisera may be used in tissue sections. The sera are shown to react strongly
with the appropriate peptides by ELISA, following the procedures of Green et al., Cell,
28, 477-487 (1982). The serum exhibiting the highest titer is used in subsequent
experiments.

2. Monoclonal antibodies.

Techniques for preparing monoclonal antibodies are well known, and
monoclonal antibodies of this invention may be prepared using a synthetic peptide,
preferably bound to a carrier, as described by Arnheiter et al., Nature, 294, 278-280
(1981).

Monoclonal antibodies are typically obtained from hybridoma tissue cultures
or from ascites fluid obtained from animals into which the hybridoma tissue was
introduced. Nevertheless, monoclonal antibodies may be described as being "raised
to" or "induced by" the synthetic peptides or their conjugates.

Particularly preferred immunological tests rely on the use of either
monoclonal or polyclonal antibodies and include enzyme-linked immunoassays
(ELISA), immunoblotting, immunoprecipitation and radioimmunoassays (RIA). For
ELISA, see Voller, A., Diagnostic Horizons 2:1-7, 1978, Microbiological Associates
Quarterly Publication, Walkersville, MD; Voller, A. et al., J. Clin. Pathol. 31:507-520
(1978); U.S. Reissue Pat. No. 31,006; UK Patent 2,019,408; Butler, J.E., Meth. Enzymol.
73:482-523 (1981); and Maggio, E. (ed.), Enzyme Immunoassay, CRC Press, Boca Raton,
FL, 1980. For RIA, see Weintraub, B., Principles of
Radioimmunoassays, Seventh Training Course on Radioligand Assay Techniques, The
Endocrine Society, March 1986, pp. 1-5, 46-49 and 68-78. For analyzing tissues for
the presence of the mutant protein of the present invention, immunohistochemistry
techniques are preferably used. It will be apparent to one skilled in the art that the
antibody molecule will have to be labelled to facilitate easy detection of the mutant protein.
Techniques for labelling antibody molecules are well known to those skilled in the art
(see Harlow and Lane, Antibodies, Cold Spring Harbor Laboratory, pp 1-726, 1989).
Alternatively, sandwich hybridization techniques may be used, e.g., using an
antibody specific for a given protein. In addition, an antibody specific for a haptenic
group conjugated to the binding protein can be used. Another sandwich detection
system useful for detection is the avidin or streptavidin system, where a protein
specific for the detectable protein has been modified by addition of biotin. In yet
another embodiment, the antibody may be replaced with a non-immunoglobulin
protein which has the property of binding to an immunoglobulin molecule, for
example Staphylococcal protein A or Streptococcal protein G, which are well-known
in the art. The protein may either itself be detectably labeled or may be detected
indirectly by a detectably labeled secondary binding protein, for example, a second
antibody specific for the first antibody. Thus, if a rabbit anti-hybrid
wild-type/nonsense protein antibody serves as the first binding protein, a labeled goat
anti-rabbit immunoglobulin antibody would be a second binding protein.

In another embodiment, the signal generated by the presence of the hybrid 
wild-type/nonsense protein is amplified by reaction with a specific antibody for that 
fusion protein (e.g., an anti-β-galactosidase antibody) which is detectably labeled.
One of ordinary skill in the art can devise without undue experimentation a number of 
such possible first and second binding protein systems using conventional methods 
well-known in the art. 

Alternatively, other techniques can be used to detect the mutant proteins,
including chromatographic and electrophoretic methods such as SDS-PAGE, isoelectric
focusing, Western blotting, HPLC and capillary electrophoresis.

Identification of Diseases According to the Invention 

The invention provides methods for identifying diseases caused by or 
associated with at least one RNA molecule having one or more transcript mutations 
which give rise to a frameshift mutation. 

Diseases are identified according to the invention as follows. The nucleotide 
sequence of an RNA molecule suspected of being involved in the pathogenesis of a 
disease is provided, e.g., from published gene sequences or from cloning and 
sequencing of a suspect gene. The amino acid sequence encoded by the RNA is then 
predicted, as are amino acid sequences of encoded mutant proteins. Mutant protein
sequences are predicted in +1 and +2 reading frames following a hypothesized
frameshift mutation. The location of the frameshift mutation may be hypothesized 
with respect to certain nucleotide sequence motifs in the RNA molecule, examples of 
such motifs including, but not limited to, GAGA, for example, GAGAC, GAGAG, 
GAGAT, and GAGAA, or CTCT, for example CTCTG, CTCTC, CTCTA and 
CTCTT. 
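
As a rough illustration of the prediction step described above, the following Python sketch locates GAGA and CTCT motifs in a coding sequence, deletes one dinucleotide at each motif, and translates the result up to the first stop codon (shown as *). The function names, the choice to delete the first two bases of the motif, and the toy input sequence are assumptions made for illustration only; they are not taken from the Figures.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a DNA coding sequence, stopping after the first stop codon (*)."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def find_motifs(cds, motifs=("GAGA", "CTCT")):
    """Return sorted (position, motif) pairs for every motif occurrence."""
    hits = []
    for motif in motifs:
        start = cds.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = cds.find(motif, start + 1)
    return sorted(hits)


def delete_dinucleotide(cds, pos):
    """Remove two bases starting at pos (assumed deletion site, for illustration)."""
    return cds[:pos] + cds[pos + 2:]


def predict_mutants(cds):
    """Predict one frameshifted protein per motif occurrence."""
    return [
        (pos, motif, translate(delete_dinucleotide(cds, pos)))
        for pos, motif in find_motifs(cds)
    ]
```

For the hypothetical input `ATGGAGAAATTTTAA`, the wild-type translation is `MEKF*`, and `predict_mutants` reports a single GAGA motif at position 3 with the frameshifted product `MEIL`. Applying `delete_dinucleotide` twice would model the second reading frame shift in the same way.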

A probe is then prepared that is specific for the mutant protein or an 
immunogenic fragment thereof (such probes being described hereinabove for detection 
of proteins or protein fragments). Depending on where the mutation that leads to the 
frameshift occurs, part of the mutant protein will have the same sequence as the wild- 
type protein and part of the protein will have the sequence of the mutant protein. 
Furthermore, depending on where the mutation occurs the mutant protein will 
terminate when the nucleotide sequence codes for a stop codon (indicated as * in the 
Figures). Thus, different mutant proteins will be produced depending on where the 
mutation occurs. 

The simplest method of probing for the presence of a particular mutant 
protein is to make an antibody to that protein or an immunogenic portion thereof. An 
immunogenic fragment may be synthesized corresponding to the C-terminus of the 
predicted mutant proteins because even if the mutation occurred at another position in 
the sequence, the probability that the derived mutant protein contains the peptide 
sequence is increased. For example, in the β-APP encoding RNAs, two different
transcript modifications have occurred (i.e., at two different GAGA motifs) which 
result in two frameshifted proteins having identical C-terminal sequences. 
Furthermore, the C-terminal region of a protein is more likely to form an epitope than 
other regions of the protein. 
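
The reasoning above, that frameshifts at different motif positions can converge on the same C-terminal sequence, can be sketched in a few lines of Python. The helper name and the N-terminal parts of the two toy proteins are hypothetical; only the shared C-terminal peptide RGRTSSKELA (SEQ ID NO: 1) comes from this document.

```python
def c_terminal_peptide(protein, length=10):
    """Return the last `length` residues, ignoring a trailing stop marker (*)."""
    residues = protein.rstrip("*")
    return residues[-length:]


# Toy frameshifted proteins with different (hypothetical) wild-type portions.
mutant_a = "MLPGLALRGRTSSKELA*"
mutant_b = "MLPGRGRTSSKELA*"

# Both mutants end in the same peptide, so one antibody could detect either.
shared = c_terminal_peptide(mutant_a) == c_terminal_peptide(mutant_b)
```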

Once a probe is made, a biological sample from a patient having the disease
and a biological sample from a patient not having the disease are probed for the
presence or absence of the mutant protein, also as described above. Alternatively,
several probes may be prepared and the combination of probes used to probe the tissue
sample. The presence of the mutant protein in a biological sample from a patient
having the disease and the absence of said mutant protein in a biological sample from
a patient not having the disease indicates that the mutant protein is a marker for the
disease or susceptibility to the disease.

Treatment of Diseases According to the Invention

The invention also relates to methods for preventing and/or treating diseases, 
vectors for preventing and/or treating the diseases, and compositions such as nucleic 
acid sequences and proteins for preventing and/or treating the diseases, which methods 
and compositions are useful in gene and protein therapies. 

The invention includes methods of treatment and/or prevention of a disease
caused by or associated with an RNA having a mutation in a GAGA or CTCT motif giving
rise to a frameshift mutation, in which one of the following is administered to a
patient suffering from or susceptible to the disease: a ribozyme, a wild-type RNA, or
both; an RNA or DNA that is complementary to the mutant RNA and capable of forming
a hybrid with the mutant RNA; a vector comprising a sequence encoding any of these
sequences; or the wild-type form of a mutant protein.

Preferred diseases which are treated according to the invention include but
are not limited to cancer or a neurodegenerative disease, especially AD. The preferred
mutant RNAs of the present invention are those encoding the β amyloid precursor
protein, the Tau protein, ubiquitin B, apolipoprotein-E4 (Apo-E4), microtubule
associated protein II (MAP 2), the neurofilament proteins (L, M, H), presenilin I,
presenilin II, Big Tau, GFAP, P53, BCL2, semaphorin III, HUPF, HMG and NSP-A,
having a deletion, insertion or other modification in the RNA leading to a frameshift
mutation.

Ribozymes useful in treatment according to the present invention are
preferably hammerhead ribozymes.

A pharmaceutical composition according to the invention will include a
therapeutically effective amount of a ribozyme, the wild-type analog of the mutant
RNA, or both, or a DNA or RNA that is complementary to the mutant RNA and
capable of forming a hybrid untranslatable sequence in vivo, in admixture with a
carrier. A therapeutically effective amount is considered to be that amount which, when
administered to a patient, provides a therapeutic benefit to the patient. Such amounts
will generally be in the range of 10 µg-100 mg of therapeutic protein/kg body weight
of the patient, preferably 50 µg-10 mg, and most preferably 100 µg-1 mg.
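
The per-kilogram ranges stated above scale linearly with body weight. The following Python sketch shows only that unit arithmetic; the tier names, the function name and the 70 kg example weight are assumptions introduced here for illustration.

```python
# Ranges stated in the text, expressed in micrograms per kg of body weight.
DOSE_RANGES_UG_PER_KG = {
    "general": (10, 100_000),        # 10 µg - 100 mg per kg
    "preferred": (50, 10_000),       # 50 µg - 10 mg per kg
    "most_preferred": (100, 1_000),  # 100 µg - 1 mg per kg
}


def dose_range_ug(body_weight_kg, tier="general"):
    """Return (low, high) total amounts in micrograms for a given body weight."""
    low, high = DOSE_RANGES_UG_PER_KG[tier]
    return low * body_weight_kg, high * body_weight_kg
```

For a hypothetical 70 kg patient, the most preferred tier corresponds to a total of 7,000 µg to 70,000 µg, i.e. 7 mg to 70 mg.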

Where vectors are useful according to the invention, the vector may be of 
linear or circular configuration and may be adapted for episomal or integrated 
existence in the host cell, as set out in the extensive body of literature known to those 
skilled in the art. The vectors may be delivered to cells using viral or non-viral
delivery systems. The choice of delivery system will depend on whether the substance 
is to be delivered to a selected central nervous system or neuronal cell type or 
generally to these cells. 

Vectors of the present invention additionally may comprise further control
sequences such as enhancers or locus control regions (LCRs), in order to lead to more
controlled expression of the encoded gene or genes. LCRs are described in
EP-A-0332667. The inclusion of a locus control region (LCR) is particularly preferred as it
ensures the DNA is inserted in an open state at the site of integration, thereby allowing
expression of the gene or genes contained in the vector. The vectors of the present
invention have a wide range of applications in ex vivo and in vivo gene therapy.

Animal Models for Disease Diagnosis 
and Treatment According to the Invention 
The invention also includes stable cell lines and transgenic animals for use as 
disease models for testing or treatment. 

A stable cell line or transgenic animal according to the invention will contain 
a recombinant gene or genes, also known herein as a transgene, encoding one or more 
mutations giving rise to a frameshift mutation which causes or is associated with a 
disease. 

The recombinant gene will encode an RNA encoding a mutated protein found
to be indicative of a disease. Preferably, the mutant protein will contain an antigenic
epitope specific for the diseased state. The recombinant gene may encode a protein
comprising at least part of the sequence designated +1 or +2 in any one of Figures 2 to
9, or an immunologically equivalent fragment thereof.

A cell line containing a transgene encoding a mutant protein, as described 
herein, is made by introducing the transgene into a selected cell line according to any 
one of several procedures known in the art for introducing a foreign gene into a cell.

A transgenic animal containing such a transgene includes a rodent, such as a 
rat or mouse, or other mammals, such as a goat, a cow, etc. and may be made 
according to procedures well-known in the art. 

Transgenic animals are useful according to the invention as disease models
for the purposes of research into diseases caused by or associated with at least one
gene encoding an RNA containing one or more mutations giving rise to a frameshift
mutation, and therapies therefor. By specifically expressing one or more mutant
genes, as defined above, the effect of such mutations on the development of the
disease can be studied. Furthermore, therapies including gene therapy and various
drugs can be tested on the transgenic animals.

Recombinant genes introduced into an animal to make a transgenic animal
useful in the invention will include those genes specifically disclosed herein,
containing a dinucleotide deletion or insertion relative to the wild-type sequence of the
gene, the dinucleotide deletion or insertion being associated with the nucleotide
sequence GAGA or CTCT; for example GAGAX or CTCTX, where X is one of G, A,
T or C; such as GAGAG, GAGAC, GAGAT and GAGAA or CTCTG, CTCTC,
CTCTT and CTCTA. Such transgenes will preferably contain a dinucleotide deletion
which is an AG deletion or a GT deletion just adjacent to GAGAG (Figure 20), for
example, one or two dinucleotide deletions associated with a GAGA, GAGAG,
GAGAC, GAGAT or GAGAA motif, leading to a +1 or +2 frameshift mutation, respectively.
In a similar manner, CTCTX can undergo the same deletion process (ΔCT).

Recombinant transgenes containing such a mutation which are particularly
useful in animal models of disease include those associated with neurodegenerative
diseases, especially Alzheimer's disease, and include but are not limited to mutant
gene sequences disclosed herein encoding mutant β amyloid precursor protein, the
Tau protein, ubiquitin B, apolipoprotein-E4 (Apo-E4), microtubule associated protein II
(MAP 2), the neurofilament proteins (L, M, H), presenilin I, presenilin II, Big Tau,
GFAP, P53, BCL2, semaphorin III, HUPF, HMG and NSP-A (see also Tables 2-8).

It also is contemplated that transgenic animals of the invention may contain
transgenes that are controlled via a regulatable and/or a regulated promoter such that
the corresponding wild-type protein is expressed during selected stages of
development and maturity of the animal and in a selected tissue, and the mutant gene
is turned on when desired. This is particularly desirable where the animal model is of
Alzheimer's disease, wherein the mutant protein begins to be expressed later in the life of
the animal. Thus, if the mutant gene is under the control of a brain-specific inducible
promoter, e.g., a neurofilament, aldolase or modified Thy-1 promoter, then onset of the
disease may be controlled via expression of the mutant gene.

Transgenic animals according to the invention may be generated to over-express a)
human β amyloid precursor protein +1, b) human ubiquitin +1 proteins, or c) human
neurofilament proteins.

Described below is an embodiment of the invention involving identification
of transcript frameshift mutations in RNA molecules encoding proteins which are
present in neuronal tissue, and how such mutations are useful in diagnosis of certain
disease states.

The cDNA sequences coding for the human β amyloid precursor protein,
Tau, ubiquitin, apolipoprotein E4, MAP 2, the neurofilament subunits low, medium
and high, presenilin I, presenilin II, Big Tau, GFAP, P53, BCL2, semaphorin III,
HUPF-I, HMG and NSP-A were obtained from various gene sequence databases.

Using the sequence data, the various GAGA or CTCT motifs in the
sequences were identified, deletions were hypothesized and the sequences of the
derived mutant proteins predicted, as shown in Figures 2-19. The sequences of both the
+1 and +2 frameshift mutant proteins were predicted.

By examining the sequences of the hypothesized mutant proteins, peptides
corresponding to the C-termini of the hypothesized mutant proteins were selected.
The peptides were synthesized using standard techniques known to those skilled in the
art. The peptides having the following sequences were synthesized: RGRTSSKELA
[SEQ ID NO: 1]; HGRLAPARHAS [SEQ ID NO: 2]; YADLREDPDRQ [SEQ ID
NO: 3]; RQDHHPGSGAQ [SEQ ID NO: 4]; YADLREDPDRQDHHPGSGAQ [SEQ
ID NO: 1400]; GGGAQ [SEQ ID NO: 5]; GAPRLPPAQAA [SEQ ID NO: 6];
KTRFQRKGPS [SEQ ID NO: 7]; PGNRSMGHE [SEQ ID NO: 8]; EAEGGSRS
[SEQ ID NO: 9]; VGAARDSRAA [SEQ ID NO: 10]; HDYPPGGSV [SEQ ID NO:
11]; SIQKFQV [SEQ ID NO: 12]; VEKPGERGGR [SEQ ID NO: 13];
PLFGRGHKRG [SEQ ID NO: 14]; EDRGDAGWRGH [SEQ ID NO: 15];
QERGASPRAAPREH [SEQ ID NO: 16]; RQPGDVAPGGQHRPVDD [SEQ ID
NO: 17]; AGLLATPEAK [SEQ ID NO: 18]; YVDVYNGGKFS [SEQ ID NO: 19];
AADERRCHLLHMCGRR [SEQ ID NO: 20]; QQATEAGQHYQPGSPLHDHSHV
[SEQ ID NO: 21]; PQEAAARTNR [SEQ ID NO: 22]; RSWVHPAPPYQMCLG
[SEQ ID NO: 23]; and GGSRTHPR [SEQ ID NO: 24].

Depending on where the mutation that leads to the frameshift occurs, part of
the mutant protein will have the same sequence as the wild-type protein and part of the
protein will have the sequence of the mutant protein. Furthermore, depending on
where the mutation occurs the mutant protein will terminate when the nucleotide
sequence codes for a stop codon (indicated as * in the Figures). Thus different mutant
proteins will be produced depending on where the mutation occurs.

It is predicted that mutations will occur at GAGA or CTCT motifs in the
cDNA, and the sequences of the mutant proteins are predicted accordingly.

Peptides were synthesized corresponding to the C-terminus of the predicted
mutant proteins because even if the mutation occurred at another position in the
sequence, the probability that the derived mutant protein contains the peptide sequence
is increased. Furthermore, the C-terminal region of a protein is more likely to form an
epitope than other regions of the protein.

The uniqueness of the synthesized peptides was confirmed by a gene
sequence database search.

Each synthesized peptide was then injected into a rabbit and an antibody
having affinity for the peptide was purified. The techniques used to obtain the antibodies
are standard techniques known to those skilled in the art.

The antibodies obtained were then tested on autopsy material of frontal
cortex, temporal lobe and hippocampus of neuropathologically confirmed AD cases
and control non-AD cases. The presence of the antibodies was determined using
standard detection methods known to those skilled in the art.

Figure 1 shows the presence of the β amyloid precursor mutant protein
(βAPP+1) in the frontal cortex of an Alzheimer patient identified using an antibody
against a peptide predicted by the +1 reading frame of β amyloid precursor protein.
The antibody used had affinity for a peptide having the following sequence:
RGRTSSKELA [SEQ ID NO: 1].

The results of other immunoreactive tests performed using the antibodies
against the predicted peptides are shown in Tables 2-5.

It can be seen that the presence of the mutant protein can be detected and
correlates with the subject having AD. The presence of one or more of the mutant
proteins can therefore be seen to be indicative of AD.

Table 6 summarizes the immunoreactivity results within the frontal cortex
(area 11), temporal cortex (area 38) and the hippocampus.

Other diseases also may be correlated with the presence of mutant proteins,
as defined herein. For example, seven patients with Down's syndrome were tested
according to the invention. Down's syndrome is trisomy of chromosome 21, which
leads to over-expression of β-amyloid precursor protein. We noted that the frontal
and temporal cerebral cortex and hippocampus of these patients contained plaques and
neurofibrillary tangles, and hypothesized that such over-expression may promote
accumulation of transcript mutations in neurons, by frameshift mutations at a GAGAG
motif in the over-expressed β-amyloid gene. After immunocytochemical staining of
tissue from frontal and temporal cerebral cortex from the Down's patients with the
above-described antibody specific for the amyloid +1 carboxy terminal peptide,
immunoreactivity was observed in the neurofibrillary tangles in six of seven patients.
Staining was absent in the frontal cortex of the matched controls. Therefore, the
mutant amyloid protein is correlated not only with Alzheimer's disease, but also with
other diseases, such as Down's, involving Alzheimer's neuropathology.

It has been found that a number of the mutations occur at GAGA or CTCT
motifs. Table 7 shows the presence of the complementary GAGA motifs in various
cDNAs of the neuronal system. The motif or, as can be seen from the sequences of
Tau and apolipoprotein E4, similar motifs such as GAGAG, GAGAC, GAGAA, and
GAGAT (in the cDNA) may be associated with the frameshift mutations that lead to or
are associated with the disease. The presence of the motif or similar motifs in other
RNA molecules may indicate that they are relevant to a disease. It is also possible that
other mutations occur that are not associated with such motifs but still lead to
frameshift mutations that cause or are associated with a disease.

Table 8 shows the presence of GAGAC motifs in particular RNA molecules
of the neuronal system, namely βAPP, Tau and Ubiquitin. This table also indicates,
inter alia, the chromosomal location of the genes from which the mutant RNA
molecules are transcribed, the molecular weight of the longest polypeptide forms
encoded by the RNA molecules, and the predicted size of the aberrant +1 peptide with
its C-terminus against which the antibodies were raised. These peptides were revealed
in a Western blot and also identified with a different antibody recognizing an epitope
on the unaffected wild-type N-terminus.

EXAMPLE 2

Selection of Antigenic Peptide

Synthetic polypeptides corresponding in sequence to a portion of a mutant
protein (whether such peptides are chemically synthesized or are chemically or
recombinantly generated fragments of a protein), as described herein, will be useful
according to the invention as antigenic peptides for generation of antibodies specific
for a mutant protein, provided they possess the following characteristics. The
synthetic peptide will include a minimum of 8 and preferably 12-15 amino acid
residues, with an optimum length of 20-21 amino acids. The hydrophilicity and
antigenic index of the amino acid sequence of the hybrid wild-type/nonsense protein
may be determined by Analytical Biotechnology Sciences, Boston, MA, using
computer programming. Potential synthetic peptides useful according to the invention
include a stretch of 12-20 amino acids preferably within the carboxy terminal 100-150
amino acids of the hybrid wild-type/nonsense protein.

The amino acid sequence of a selected peptide is searched in a computer 
database of sequences (e.g., GenBank) to preclude the possibility that at reasonable 
concentrations, antisera to any one peptide would specifically interact with any protein 
of a known sequence. Preferred sequences are those which are determined not to have 
a close homolog (i.e., "close" meaning 80-100% identity). 
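
The homology screen described above can be illustrated with a minimal Python sketch. It uses a simple ungapped sliding-window comparison as a stand-in for a real database search tool, which is an assumption; the function names and the in-memory dictionary standing in for a sequence database are likewise hypothetical. The 80% threshold follows the definition of "close" given in the text.

```python
def max_identity(peptide, protein):
    """Highest fraction of identical residues over all ungapped alignments
    of the full-length peptide against the protein."""
    n = len(peptide)
    if n == 0 or len(protein) < n:
        return 0.0
    best = 0
    for start in range(len(protein) - n + 1):
        window = protein[start:start + n]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches)
    return best / n


def has_close_homolog(peptide, database, threshold=0.8):
    """True if any database sequence reaches the identity threshold."""
    return any(
        max_identity(peptide, seq) >= threshold for seq in database.values()
    )
```

A peptide for which `has_close_homolog` returns `False` against the chosen sequence set would be retained as a candidate under this simplified criterion; a production screen would also need to consider gapped and partial alignments.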

EXAMPLE 3

Detection of "Mutant" Protein 

Another embodiment of this invention relates to an assay for the presence of 
the "mutant" or mutant protein in a given tissue as indicative of a disease state. Here, 
an above-described antibody is prepared. The antibody or idiotype-containing 
polyamide portion thereof is then admixed with candidate tissue and an indicating 
group. The presence of the naturally occurring amino acid sequence is ascertained by 
the formation of an immune reaction, as signalled by the indicating group. Candidate 
tissues include any tissue or cell line or bodily fluid to be tested for the presence of the 
mutant protein, as described hereinabove. 

Expression of a given hybrid wild-type/nonsense protein may be investigated 
using antiserum prepared in rabbits against a peptide corresponding to a carboxy 
terminal stretch of amino acids in the hybrid wild-type/nonsense protein as follows. 

CMK cells or U3T3 cells are metabolically labeled with 35S-methionine and
extracts are immunoprecipitated with antiserum. If the hybrid wild-type/nonsense
protein is present in the cells, then a protein species of corresponding molecular weight
will be detected in CMK and U3T3 cells. The protein may be localized to the
membrane, nucleus or cytoplasm by Western blot analysis of the nuclear, membrane
and cytoplasmic fractions, as generally described in Towbin et al., Proc. Natl. Acad.
Sci. USA, 76, 4350-4354 (1979). This localization may be confirmed by
immunofluorescence analysis to be associated mainly with the plasma membrane.
Metabolic labeling, immunoprecipitation, and immunolocalization assays are
performed as described previously (Furth, M.E., et al., Oncogene 1:47-58, 1987;
Laemmli, U.K., Nature 227:680-685, 1970; Yarden, Y., et al., EMBO J. 6:3341-3351,
1987; Konopka, J.B., et al., Mol. Cell. Biol. 5:3116-3123, 1985). For immunoblot
analysis, total lysates are prepared using Furth's lysis buffer (Furth, M.E., et al.,
Oncogene, 1:47-58, 1987). Relative protein concentrations are determined with a
colorimetric assay kit (Bio-Rad) with bovine serum albumin as the standard. A
portion of lysate containing approximately 0.05 mg of protein was mixed with an
equal volume of 2 x SDS sample buffer containing 2-mercaptoethanol, boiled for 5
min., fractionated on 10% polyacrylamide-SDS gels (Konopka, J.B., et al., J. Virol.,
51:223-232, 1984) and transferred to Immobilon polyvinylidene difluoride (Millipore
Corp., Bedford, MA) filters. Protein blots were treated with specific antipeptide
antibodies (see below). Primary binding of the specific antibodies may be detected
using anti-IgG second antibodies conjugated to horseradish peroxidase and subsequent
chemiluminescence development (ECL Western blotting system, Amersham
International).

For metabolic labeling, 10^6 cells are labeled with 100 µCi of 35S-methionine
in 1 ml of Dulbecco's modified Eagle's medium minus methionine (Amersham Corp.)
for 16 h. Immunoprecipitation of protein from labeled cells with antipeptide antiserum
is performed as described (Dymecki, S.M., et al., J. Biol. Chem. 267:4815-4823, 1992).
Portions of lysates containing 10^7 cpm of acid-insoluble 35S-methionine were
incubated with 1 µg of the antiserum in 0.5 ml of reaction mixture.
Immunoprecipitation samples were analyzed by SDS-polyacrylamide gel
electrophoresis and autoradiography.

For immunolocalization studies, 10^7 CMK cells are resuspended in 1 ml of
sonication buffer (60 mM Tris-HCl, pH 7.5; 6 mM EDTA, 15 mM EGTA, 0.75 M
sucrose, 0.03% leupeptin, 12 mM phenylmethylsulfonyl fluoride, 30 mM
2-mercaptoethanol). Cells are sonicated 6 times for 10 seconds each and centrifuged at
25,000 x g for 10 min at 4°C. The pellet is dissolved in 1 ml of sonication buffer and
centrifuged at 25,000 x g for 10 min at 4°C.

The pellet (nucleus fraction) is resuspended in 1 ml of sonication buffer and
added to an equal volume of 2 x SDS sample buffer. The supernatant obtained above
(after the first sonication) is again centrifuged at 100,000 x g for 40 min at 4°C. The
supernatant (cytosolic fraction) is removed and added to an equal volume of 2 x
concentrated SDS sample buffer. The remaining pellet (membrane fraction) is washed
and dissolved in sonication buffer and SDS sample buffer as described above. Protein
samples are analyzed by electrophoresis on 10% polyacrylamide gels, according to the
Laemmli method (Konopka, J.B., et al., Mol. Cell. Biol. 5:3116-3123, 1985). The
proteins are transferred from the gels onto a 0.45-µm polyvinylidene difluoride
membrane for subsequent immunoblot analysis. Primary binding of antibodies is
detected using anti-IgG second antibodies conjugated to horseradish peroxidase.

For immunohistochemical localization of a given protein, if desired, CMK
cells or U3T3 cells are grown on cover slips to approximately 50% confluence and are
washed with PBS (pH 7.4) after removing the medium. The cells are prefixed for 1
min at 37°C in 1% paraformaldehyde containing 0.075% Triton X-100, rinsed with
PBS and then fixed for 10 min with 4% paraformaldehyde. After the fixation step,
cells are rinsed in PBS, quenched in PBS with 0.1 and finally rinsed again in PBS. For
antibody staining, the cells are first blocked with a blocking solution (3% bovine
serum albumin in PBS) and incubated for 1 h at 37°C. The cells are then incubated
for 1 h at 37°C with antiserum (1:100 dilution) or with preimmune rabbit serum
(1:100) (see below). After the incubation with the primary antibody, the cells are
washed in PBS containing 3% bovine serum albumin and 0.1% Tween 20 and
incubated for 1 hour at 37°C in a fluorescein-conjugated donkey anti-rabbit IgG
(Jackson Immunoresearch, Maine), diluted 1:100 in blocking solution.

The coverslips are washed in PBS (pH 8.0), and glycerol is added to each
coverslip before mounting on glass slides and sealing with clear nail polish. All glass
slides were examined with a Zeiss Axiophot microscope.

EXAMPLE 4 

Biological Sample Analysis

The above methods for detection of a given mutant protein or nucleic acid are
applicable to analyses involving tissues, cell lines and bodily fluids (e.g. cerebrospinal
liquor or blood, including but not limited to venous, arterial and cord blood) suspected
of containing the marker protein.

For example, a sample of CNS tissue suspected of being in a diseased state 
may be analyzed, it having been previously observed according to the invention that 
tissue of that particular diseased state contains detectable levels of hybrid wild- 
type/nonsense proteins relative to healthy tissue. 

An aliquot of the suspect sample and a healthy control sample are provided 
and admixed with an effective amount of an antibody specific for the hybrid wild- 
type/nonsense protein, as herein described, and an indicating group. The admixture is 
typically incubated, as is known, for a time sufficient to permit an immune reaction to 
occur. The incubated admixture is then assayed for the presence of an immune 
reaction as indicated by the indicating group. The relative levels of the hybrid wild- 
type/nonsense protein in the suspect sample and the control sample are then compared, 
allowing for diagnosis of a diseased or healthy state in the suspect sample. 

The above types of analyzing for the presence of the hybrid wild- 
type/nonsense protein may, of course, be performed using analysis for the coding 
RNA, e.g., via Northern blot or RNA dot blot analysis, both of which are conventional 
and known in the art. 

Disease Treatment According to the Invention 
Disease treatment according to the invention contemplates eliminating
mutant transcripts. Evidence supporting the presence of transcript mutant RNA
molecules is that in homozygous Brattleboro hypothalamus cells, vasopressin cDNAs
having the frameshift mutation were observed in 1 in 100 colonies, whereas genomic
vasopressin DNA having the frameshift mutation was not identified.

In the human hypothalamus no age-related increase in the number of
vasopressin +1 immunoreactive cells is observed (contrary to that in rat). However, in
the fetal period (29-42 weeks of gestation) an enormous increase in the number of
immunoreactive cells containing the +1 vasopressin protein is detected. After birth,
the number of these cells falls back to just a few. In Down's syndrome, where βAPP
gene expression is very high (5-fold higher than normal), the highest levels of βAPP
and +1 mutant proteins were also observed, higher than in AD where βAPP gene
expression is not found to be increased over normal levels. It also has been found that
βAPP+1 and UbiB+1 mutant proteins coexist (and are present in tangles and
dystrophic neurites) in the same cell. Accordingly, it is unlikely that the transient
increase is due to a genomic event.

Once an RNA molecule containing a frameshift mutation (i.e., a frameshifted
transcript), or a mutant protein, is correlated with a disease state, the disease is treatable
according to the invention as follows: by administering to a patient in need thereof
enzymes which serve to selectively eliminate frameshifted RNA via cleavage, e.g.,
ribozymes; by administering the wild-type version of the mutant RNA, preferably in
substantially uncleavable form; by administering the wild-type version of the hybrid
wild-type/nonsense protein; or by administering oligonucleotides or sequences
encoding oligonucleotides complementary to a mutant RNA to a cell in order to form a
nucleic acid duplex which renders the mutant RNA untranslatable.

A patient in need thereof will include a patient exhibiting symptoms of the
disease, and even those patients suspected of developing the disease, i.e., who are
monitored according to the invention by testing a tissue sample, e.g., the
cerebrospinal fluid, for the presence of frameshifted peptides (e.g., peptides having an
amino acid sequence in the +1 or +2 reading frame).

According to the invention, a ribozyme may be delivered to affected or
susceptible cells leading to the cleavage of the mutant RNA and resultant inability of
the cell to translate the mutant RNA into mutant protein. The wild-type protein, if not
already produced by the cell, may be provided in protein form or via administering the
wild-type RNA to the cell along with the ribozyme. The wild-type RNA may be
engineered so as to contain a sequence that is distinguishable from the mutant RNA
sequence other than simply at the level of the GAGA or CTCT mutation. For
example, the wild-type RNA may contain third base silent mutations, i.e., which do not
change the coding sequence of the RNA, but which render the wild-type RNA less or
substantially unsusceptible to cleavage by the ribozyme.
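By way of illustration only, the third base silent mutation approach can be sketched in Python as follows. The sketch replaces each codon with a synonymous codon that differs only at the third base, where one exists, so that the encoded amino acid sequence is unchanged. The codon table is the standard genetic code; the function names and the 12-nucleotide fragment are hypothetical and are not taken from the sequences disclosed herein. In this particular fragment the recoding also happens to remove the GAGA motif, which is not guaranteed for other sequences.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons enumerated with bases in U, C, A, G order.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(rna: str) -> str:
    """Translate the complete codons of an RNA string ('*' marks a stop)."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def silence_third_base(rna: str) -> str:
    """Swap each codon for a synonymous one differing only at the third base."""
    usable = len(rna) - len(rna) % 3
    recoded = []
    for i in range(0, usable, 3):
        codon = rna[i:i + 3]
        synonyms = [codon[:2] + b for b in BASES
                    if b != codon[2]
                    and CODON_TABLE[codon[:2] + b] == CODON_TABLE[codon]]
        # Codons such as AUG have no third-base synonym and are kept as-is.
        recoded.append(synonyms[0] if synonyms else codon)
    return "".join(recoded) + rna[usable:]


wild_type = "AUGGAGAGGUGA"  # hypothetical coding fragment, frame assumed
resistant = silence_third_base(wild_type)
print(resistant, translate(resistant) == translate(wild_type))
```

Whether a given recoded transcript actually escapes cleavage depends on the binding arms of the ribozyme used, so such a recoding would still need to be checked against the ribozyme target sequence.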

Without being bound by any one theory, it is suggested that decreasing the
percentage of mutant RNA and increasing the percentage of the correct protein
produced in relation to the hybrid wild-type/nonsense protein will reduce or prevent
further progression of the disease, and possibly reverse the diseased state. In addition,
it is possible that not every mutant transcript results in a mutant protein that is
directly toxic to the neuronal tissue. For example, the mutant protein may be routed to
the proteasomal and/or lysosomal system or just secreted (e.g., by the constitutive or
regulated pathway) and degraded elsewhere. However, sometimes the mutant protein
will be accumulated in the membranes of organelles, for instance in the endoplasmic
reticulum, thus disrupting the normal processes of the cellular machinery.

The wild-type version of the mutated RNA encodes the correct protein.
When the disease is a neurodegenerative disease, preferred wild-type sequences
include the RNAs encoded by the β-amyloid precursor protein gene, the Tau gene,
the ubiquitin B gene, the apolipoprotein-E4 gene, the microtubule associated protein II
(MAP2) gene, the neurofilament protein genes (L, M and H), the presenilin I and II
genes, Big Tau, GFAP, P53, BCL2, Semaphorin III, HUPF-1, HMG and NSP-A. The
sequences of these genes are provided herein in the figures. Other preferred wild-type
RNAs are encoded by the alpha and beta tubulin genes, the sequences of which are
found in Cowan et al., Mol. Cell. Biol. 3, 1738-1745 (1983) and Lewis et al., J. Mol.
Biol. 182, 11-20 (1985), respectively.

When the disease is a non-hereditary cancer, preferred wild-type RNAs are
encoded by gene sequences which include but are not limited to the human p53 gene
and the BCL-2 gene. Mammalian phosphoprotein p53 has been shown to play an
essential role in regulation of cell division and is required for the transition from phase
G0 to G1 of the cell cycle. P53 is normally present in very low levels in normal cells
and is believed to be a tumor suppressor gene; when present at high levels, p53 has
been shown to play a role in transformation and malignancy. P53 gene alleles from
normal and malignant tissues have been shown to contain BglII site polymorphism
(Buchman et al., 1988, Gene 70:245). The p53 coding region contains several GAGA
motifs, e.g., GAGAC at position 1476 of the sequence published in Buchman et al.;
GAGA at position 1498; GAGA at position 1643; and GAGA at position 1713, which
motifs present candidate sites for frameshift RNAs according to the invention. A
frameshift mutation within a p53 RNA thus may lead to loss of the natural p53 tumor
suppressor function. Detection of such a mutation in p53 may be diagnostic of
pre-malignancy or malignancy, and treatment as described herein which results in
correction of p53 function may restore tumor suppressor function.
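
Candidate motif positions of the kind listed above can be located computationally. The following Python sketch, offered as an illustration under stated assumptions, reports the 1-based start position of every GAGA or CTCT occurrence in a sequence, including overlapping occurrences. The function name and the short test sequence are hypothetical; the sketch does not reproduce the numbering of the Buchman et al. sequence.

```python
import re


def find_motifs(seq, motifs=("GAGA", "CTCT")):
    """Return sorted (1-based position, motif) pairs, overlaps included."""
    hits = []
    upper = seq.upper()
    for motif in motifs:
        # A zero-width lookahead lets overlapping hits (e.g. GAGAGA) be found.
        for match in re.finditer(f"(?={motif})", upper):
            hits.append((match.start() + 1, motif))
    return sorted(hits)


# Hypothetical DNA fragment used only to exercise the function.
print(find_motifs("ATGGAGAGACCTCTGA"))
```

For an RNA sequence the second motif would be written CUCU rather than CTCT and can be passed through the `motifs` parameter.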

In diabetes mellitus type II, which occurs with increased frequency in aged
persons, the islets of Langerhans degenerate, possibly as a result of frameshift
mutations in various transcripts (e.g., the ubiquitin transcript).

The invention also encompasses methods of combatting diseases caused by at
least one RNA having one or more GA, GT or CT deletions giving rise to a
frameshift mutation by targeting the RNA transcript. Thus, it is also contemplated
according to the invention that a frameshift mutation within an RNA may be corrected
at the level of the frameshifted RNA via cleavage using a ribozyme having specificity
for the mutant RNA sequence (see Denman et al., Arch. Biochem. Biophys.
323: 71-78, 1995), thereby eliminating the mutant mRNA. The disease associated with the
frameshifted RNA is thus treated by administering an appropriate ribozyme, or
sequences encoding the ribozyme, to the patient.

Ribozymes of selected specificities may be made as described by Sullenger
& Cech (Nature 371: 619-622, 1994), herein incorporated by reference. Ribozymes
and sequences encoding ribozymes may be prepared as described by Tuschl et al.,
Curr. Opin. Struc. Biol. 5:296, 1995 and Wahl et al., Curr. Opin. Struc. Biol. 5:282,
1995.

The invention also encompasses methods of treating diseases caused by or
associated with at least one RNA having one or more GAGA or CTCT mutations
giving rise to a frameshift by delivery of complementary oligonucleotides or
sequences encoding complementary oligonucleotides to a target cell containing a
frameshifted RNA. The oligonucleotides will have a mutant sequence with respect to
the region of the mutant RNA containing the GA, GT or CT deletion, and thus may
serve to form a hybrid in vivo with the mutant RNA, rendering it untranslatable.
Oligonucleotides with strong target site binding affinity, i.e., with full target site
homology, are preferred. Also preferred are oligonucleotides between 10 and 30
nucleotides in length and containing a CTCT, CTCTG, CTCTC, CTCTT or CTCTA
motif or a GAGA, GAGAG, GAGAC, GAGAT or GAGAA motif.
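
As a minimal sketch of how a fully complementary oligonucleotide within the preferred 10 to 30 nucleotide range might be derived, the following Python code takes a window of a target RNA and returns its reverse complement, written 5' to 3'. The function names and the target fragment are hypothetical and serve only to illustrate base pairing; chemical modification, stability and specificity screening are outside the scope of the sketch.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string, read 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def antisense_oligo(target_rna, start, length=20):
    """Return an oligo fully complementary to target_rna[start:start+length]."""
    if not 10 <= length <= 30:
        raise ValueError("preferred oligonucleotide length is 10-30 nt")
    window = target_rna[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the target")
    return reverse_complement(window)


# Hypothetical mutant fragment; not a sequence disclosed herein.
target = "GCGUCUGAGGUAUC"
oligo = antisense_oligo(target, 0, 12)
print(oligo)
```

Taking the reverse complement of the oligonucleotide returns the original window, which is a quick check that the two strands pair along their full length.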

Disease treatment according to the invention is described below and includes
preparation of the administered substance and administration of the substance to a
patient suffering from a disease according to the invention. As used herein
"substance" refers to any one of the following: a ribozyme, a nucleic acid sequence
encoding a ribozyme, a wild-type transcript, an antisense mutant RNA or DNA or a
nucleic acid sequence encoding an antisense sequence, a wild type protein encoded by
the wild type gene, or an antibody specific for the frameshifted (nonsense) protein.

Disease treatment according to the invention may be accomplished as
follows. In Example 5, treatment using ribozymes according to the invention is
described. In Example 6, administration of vectors is described. In Example 7,
administration of proteins, ribozymes, or nucleic acid vectors using liposomes is
described. In Example 8, delivery of these substances across the blood-brain barrier is
described. Lastly, in Example 9, methods of delivering cells comprising a protein,
ribozyme or other nucleic acid, such as a vector bearing a gene expression construct,
are described.

EXAMPLE 5 

Treatment According to the Invention Using Ribozymes

Preparation and delivery of a ribozyme or nucleic acid sequence encoding a
ribozyme which effect or facilitate selective removal of the frameshifted RNA is
carried out as follows.

Selective Elimination of Mutant Transcripts According to the Invention.
The invention thus also encompasses methods of treating diseases caused by
the translation of frameshifted mRNAs which are the result of transcriptional
infidelity occurring at or adjacent to GAGA or CTCT motifs in the β-APP and
ubiquitin B genes. It is believed that accumulation of aberrant proteins encoded by
these messages contributes to the progression of Alzheimer's disease; therefore,
elimination of the mutant transcripts is of therapeutic value. It is contemplated
according to the invention that the mutant transcripts described herein are rendered
untranslatable in a cell via ribozyme-mediated cleavage using ribozymes designed and
administered as described herein. In addition, it may be advantageous in certain
circumstances to replace the ribozyme-cleaved messages with an exogenous transcript
encoding the wild-type protein, which transcript is cleavage-resistant and the synthesis
of which is therefore not subject to transcriptional errors or post-transcriptional
modification such as those that produced the mutant transcripts described herein.

Treatment strategies are described below for selective elimination of the
ubiquitin B and β-APP mutant transcripts in cells. The invention, however, also
contemplates selective removal of other mutant transcripts, whether disclosed herein
or later-discovered, according to the methods described hereinbelow.

Ribozymes of the hammerhead class are the smallest known, and lend
themselves both to in vitro synthesis and delivery to cells (summarized by Sullivan, J.
Invest. Dermatol. 103: 85S-98S, 1994; Usman et al., Curr. Opin. Struct. Biol. 6: 527-533,
1996). It is required of hammerhead cleavage targets that they comprise the
sequence motif UH, wherein H denotes the ribonucleotides A, U, or C, but not G; the
sequence is cleaved following the H. The core functional unit of the hammerhead
ribozyme is a tripartite structure made up of helix I, which hybridizes to mRNA
sequences 3' of the cleavage site, helix II, a 22 ribonucleotide catalytic domain which
mediates the cleavage reaction, and helix III, which hybridizes to sequences 5' of, and
including the "U" of, the UH cleavage motif (Haseloff and Gerlach, Nature 334: 585-591,
1988; Ruffner et al., Biochemistry 33: 10695-10702, 1990). Studies have shown
that the lengths of helices I and III are proportional to the efficiency with which
ribozymes both bind the area surrounding the cleavage site and release themselves
from it following cleavage; the former is critical for target recognition, while the latter
is important for maintaining kinetics that indicate true catalytic activity, namely,
raising the ratio of target molecules inactivated to ribozymes above the 1:1
stoichiometric ratio observed with antisense-RNA-mediated inactivation. Ideally,
helix I is 3 to 5 ribonucleotides in length, relative to 9 to 13 ribonucleotides for helix
III (Tabler et al., Nucleic Acids Res. 22: 3958-3965, 1994; Hendry and McCall,
Nucleic Acids Res. 24: 2679-2684, 1996). Other factors, such as stem loop formation
in the unbound ribozyme and target mRNA, also play a role in reaction kinetics and
ribozyme stability, and in order to predict and/or compensate for such interactions,
molecular modeling studies and in vitro trials of numerous ribozyme designs are often
undertaken (Sioud et al., Nucleic Acids Res. 22: 5571-5575, 1994; Gavin and Gupta, J.
Biol. Chem. 272: 1461-1472, 1997).
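
The target requirements described above can be illustrated with a short Python sketch. It scans an mRNA for UH motifs (U followed by A, U or C), places the cut immediately after the H, and reports the target sequence that helix III would pair with (5' of the site, ending at the U) and that helix I would pair with (3' of the cut). The function name, the default arm lengths and the example sequence are assumptions chosen for illustration; the defaults merely fall within the 3 to 5 and 9 to 13 ribonucleotide ranges mentioned above, and no secondary-structure effects are modeled.

```python
def hammerhead_sites(mrna, helix_i=5, helix_iii=11):
    """List (cut_index, helix_III_target, helix_I_target) for each UH motif.

    cut_index is the 0-based index of the first base 3' of the cut.
    """
    sites = []
    for i in range(len(mrna) - 1):
        if mrna[i] == "U" and mrna[i + 1] in "AUC":
            cut = i + 2  # cleavage occurs following the H
            # Helix III pairs with sequence 5' of the site, including the U.
            arm_iii = mrna[max(0, i + 1 - helix_iii):i + 1]
            # Helix I pairs with sequence 3' of the cleavage site.
            arm_i = mrna[cut:cut + helix_i]
            sites.append((cut, arm_iii, arm_i))
    return sites


# Hypothetical 12-nt target with a single UC motif.
print(hammerhead_sites("GGAGUCAAGGCC", helix_i=3, helix_iii=4))
```

The ribozyme arms themselves would be the reverse complements of the two reported target segments.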

Mutant ubiquitin B transcripts can be removed from cells via the application
of hammerhead ribozymes delivered to these cells in liposome vectors. The invention
comprises use of these ribozymes to recognize and cleave mutant transcripts at a site 5'
to the GAGA or GGT-containing site that is the source of polymerase slippage during
transcription, thereby ridding the cells of the frame-shifted portion of the translated
products of these defective messages.

The sequence immediately preceding the GAGA motif in the ubiquitin
transcript is GCGUCU, which includes the cleavage recognition motif UC. Given the
length considerations posed above, the sequence ideally bound by helix I for a
particular mutant is GAG; however, in that such a ribozyme would be expected to bind
the mutant and wild-type transcripts with equal efficiency, helix I must be lengthened
to include four more nucleotides, such that all seven bases will hybridize to the mutant
transcript while sufficient mismatch to destabilize binding to the wild-type sequence
will result. This strategy is more efficient in cases in which the mutant transcript has
arisen via deletion rather than insertion, since in the latter, the effect of the absolute
length of helix I on target release becomes a concern; however, delivery of a pool of
differentially-designed ribozymes complementary to various mutant sequences that
can result from imprecise transcription of the GAGA motif or GGT sequence in the
case of ubiquitin, and of sufficient mismatch with the wild-type sequence to inhibit
efficient binding of a ribozyme to it, should eliminate translation of the frame-shifted
products of a large proportion of defective messages.

Such a strategy is practical in situations in which the cleavage site is 1 to (at
most) 5 bases to the 5' side of the mutated site; however, longer distances require
accordingly longer helix I binding domains which, combined with the need to create 3'
mismatches for differentiation between mutant and wild-type transcripts, make such an
approach inadequate for dealing with certain mutations. This is true of GAGA-defective
transcripts of β-APP. While one GAGA motif is separated from the
cleavage site by a single base, the remaining four motifs are between 7 and 20 bases
from the nearest 5' cleavage site. In such a case, cleavage of both the wild-type
and mutant transcripts via hammerhead ribozymes may be unavoidable, and replacement
of the wild-type transcript with a cleavage-resistant transcript must be undertaken in
concert with removal of mutant
transcripts (see Table 10 for possible sequence substitutions resulting in a cleavage-
resistant transcript); here, the ribozyme is designed to cleave both types of message at
any UH site 5' of the first GAGA motif.
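
The distance criterion discussed above lends itself to a simple check. The Python sketch below, given as an assumption-laden illustration, finds each GAGA motif in a transcript, locates the nearest UH cleavage position on its 5' side, and reports the number of bases lying between the cut and the motif. The function names are hypothetical; the example string joins the GCGUCU sequence quoted above to a GAGA motif and two arbitrary trailing bases, and the threshold of 5 bases is taken from the text.

```python
def upstream_gaps(mrna, motif="GAGA"):
    """Return (motif_start, gap) pairs; gap is None if no 5' UH site exists."""
    # Index of the first base 3' of each cut (cleavage follows the H of UH).
    cuts = [i + 2 for i in range(len(mrna) - 1)
            if mrna[i] == "U" and mrna[i + 1] in "AUC"]
    results = []
    start = mrna.find(motif)
    while start != -1:
        upstream = [c for c in cuts if c <= start]
        gap = start - max(upstream) if upstream else None
        results.append((start, gap))
        start = mrna.find(motif, start + 1)
    return results


def selective_design_practical(gap, max_gap=5):
    """True when the cut lies within max_gap bases 5' of the motif."""
    return gap is not None and gap <= max_gap


for motif_start, gap in upstream_gaps("GCGUCUGAGAGG"):
    print(motif_start, gap, selective_design_practical(gap))
```

Motifs that fail the check correspond to the case in which both wild-type and mutant messages are cleaved and a cleavage-resistant replacement transcript is supplied.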

It may be advantageous to replace the β-APP transcript in all cells in which
it is needed; therefore, co-delivery of an expression vector bearing a spliced β-APP
minigene driven by β-APP promoter sequences may be employed. Numerous studies
of this promoter have been undertaken (among them, Lahiri and Robakis, Brain Res.
Mol. Brain Res. 9: 253-7, 1991; Bourbonniere and Nalbantoglu, Brain Res. Mol. Brain
Res. 19: 246-250, 1993; Lukiw et al., Brain Res. Mol. Brain Res. 22: 121-131, 1994;
Lahiri and Nall, Brain Res. Mol. Brain Res. 32: 233-40, 1995; Bourbonniere and
Nalbantoglu, Brain Res. Mol. Brain Res. 35: 304-308, 1996; Quitschke et al., J. Biol.
Chem. 271: 22231-9, 1996), and it has been demonstrated that 96 base pairs 5' to the
transcriptional start site are sufficient for cell-type-specific promoter activity in tissue
culture (Quitschke and Goldgaber, J. Biol. Chem. 267: 17362-17368, 1992). The 96
base pairs can be fused to a minigene engineered such that alterations are made in the
ribozyme recognition site to prevent cleavage of the replacement β-APP transcript and
in the GAGA motifs to inhibit slippage of the transcriptional machinery such as
produces the mutant transcripts in the first place; these replacements should be
performed such that translationally "silent" mutations are introduced in each case.
Examples of such changes are shown in Table 10.
EXAMPLE 6

Preparation of Nucleic Acid Vectors 

Sequences encoding ribozymes or a wild-type version of a mutant RNA, or
an antisense (complementary) mutant oligonucleotide sequence may be cloned into an
appropriate vector for expression in a desired mammalian cell. The vector will include
a promoter that is expressed in the target cell type, and also may include an enhancer
and locus control region, as selected for expression in a given cell type. Examples of
vectors useful according to the invention include but are not limited to any vector
which results in successful transfer of the coding sequences to the target mammalian
cell. A nucleic acid may be transfected for use in the invention using a viral (e.g.,
adenoviral or retroviral) or non-viral DNA or RNA vector, where non-viral vectors
include, but are not limited to, plasmids, linear nucleic acid molecules, artificial
chromosomes and episomal vectors. Expression of heterologous genes has been
observed after injection of plasmid DNA into muscle (Wolff J. A. et al., 1990, Science
247: 1465-1468; Carson D.A. et al., US Patent No. 5,580,859), thyroid (Sykes et al.,
1994, Human Gene Ther. 5: 837-844), melanoma (Vile et al., 1993, Cancer Res. 53:
962-967), skin (Hengge et al., 1995, Nature Genet. 10: 161-166), liver (Hickman et
al., 1994, Human Gene Therapy 5: 1477-1483) and after exposure of airway
epithelium (Meyer et al., 1995, Gene Therapy 2: 450-460).

For example, the retroviral gene transfer vector SAX (Kantoff et al., Proc.
Natl. Acad. Sci. 83:6563, 1986) may be used to insert a selected coding sequence into a
target cell. SAX is a Moloney virus based vector with the neoR gene promoted from
the retroviral LTR and the human ADA gene promoted from an internal SV40
promoter. Thus, the SAX vector may be engineered by one of skill in the art to
contain the coding sequence for a ribozyme, or a wild-type RNA, or a selected
antisense sequence, identified as described herein, e.g., by substituting the desired
coding region for the hADA coding region in the SAX vector.

Expression vectors are known in the art which encode, or may be engineered
to encode, a selected ribozyme. Yuyama et al., Nucl. Acids Res. 22:5060, 1994,
describe a multifunctional expression vector encoding several ribozymes. This vector
may be adapted to encode a ribozyme of a selected specificity by replacing one or
both ribozyme sequences in the vector with a selected ribozyme sequence. Zhou et al.,
Gene 149:33, 1994, and Yamada et al., Virology 205:121, 1994, describe retroviral
transduction of ribozyme sequences into T cells. These retroviral vectors may be
adapted to encode a selected ribozyme sequence. Liu et al., Gene Therapy 1:32, 1994,
and Lee et al., Gene Therapy 2:377, 1995, describe expression vectors which are
adaptable for use in expression of any nucleic acid sequence contemplated according
to the invention.



Generally, nucleic acid molecules are administered in a manner compatible
with the dosage formulation, and in such amount as will be prophylactically and/or
therapeutically effective. When the end product (e.g., an antisense RNA molecule or
ribozyme) is administered directly, the dosage to be administered is directly
proportional to the amount needed per cell and the number of cells to be
transfected, with a correction factor for the efficiency of uptake of the molecules. In
cases in which a gene must be expressed from the nucleic acid molecules, the strength
of the associated transcriptional regulatory sequences also must be considered in
calculating the number of nucleic acid molecules per target cell that will result in
adequate levels of the encoded product. Suitable dosage ranges are on the order of,
where a gene expression construct is administered, 0.5 to 1 µg, or 1 to 10 µg, or
optionally 10 to 100 µg of nucleic acid in a single dose. It is conceivable that dosages of
up to 1 mg may be advantageously used. Note that the number of molar equivalents
per cell varies with the size of the construct, and that absolute amounts of DNA used
should be adjusted accordingly to ensure adequate gene copy number when large
constructs are injected.

Nucleic acid molecules to be administered according to the invention may,
for example, be formulated in a physiologically acceptable diluent such as water,
phosphate buffered saline, or saline, and further may include an adjuvant; however, it
is contemplated that other formulations may advantageously be employed. Adjuvants
such as incomplete Freund's adjuvant, aluminum phosphate, aluminum hydroxide, or
alum are materials well known in the art. Administration of a nucleic acid molecule
as described herein may be either localized or systemic. Methods for both localized
and systemic administration of a pharmacological composition are well known in the
art.
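
The dependence of copy number on construct size noted in the dosage discussion above can be made concrete with a short calculation. The following Python sketch is illustrative only: it assumes double-stranded DNA with an average mass of about 650 g/mol per base pair and uses the Avogadro constant, neither of which is stated in this specification, and the function and parameter names are hypothetical. The second function applies the stated proportionality (amount needed per cell times number of cells, corrected for uptake efficiency).

```python
AVOGADRO = 6.022e23   # molecules per mole
AVG_BP_MASS = 650.0   # g/mol per base pair; assumed average for dsDNA


def copies_per_microgram(construct_bp):
    """Approximate number of construct molecules in 1 microgram of dsDNA."""
    moles = 1e-6 / (construct_bp * AVG_BP_MASS)
    return moles * AVOGADRO


def micrograms_needed(construct_bp, cells, copies_per_cell, uptake_efficiency):
    """Mass of DNA needed, corrected for the fraction actually taken up."""
    molecules = cells * copies_per_cell / uptake_efficiency
    return molecules / copies_per_microgram(construct_bp)


# Hypothetical figures: 5 kb construct, 1e6 cells, 100 copies/cell, 10% uptake.
print(copies_per_microgram(5000))
print(micrograms_needed(5000, 1e6, 100, 0.1))
```

Doubling the construct length halves the number of copies per microgram, which is why the absolute amount of DNA is adjusted upward for large constructs.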

Nucleic acid constructs of use in the invention can be given in a single or
multiple dose. A multiple dose schedule is one in which a primary course of
administration can include 1-10 separate doses, followed by other doses given at
subsequent time intervals required to maintain and/or reinforce the cellular level of the
transfected nucleic acid. Such intervals are dependent on the continued need of the
recipient for the therapeutic nucleic acid, the ability of a given nucleic acid to
self-replicate in a mammalian cell if it does not become integrated into the recipient's
genome and the half-life of a non-renewable nucleic acid (e.g., a molecule that will not
self-replicate). Preferably, when the medical needs of the recipient mammal dictate
that a nucleic acid or a product thereof will be required throughout its lifetime, or at
least over an extended period of time, such as a year or more, a nucleic acid may be
encoded by sequences of a vector that will self-replicate in the target cells. The
efficacy of transfection and subsequent maintenance of the nucleic acid molecules may
be assayed either by monitoring the activity of a marker gene, which may additionally
be comprised by the transfected construct, or by the direct measurement of either the
protein product encoded by the gene of interest or the reduction in the levels of a
protein the production of which it is designed to inhibit. The assays can be performed
using conventional molecular and biochemical techniques, such as are known to one
skilled in the art.

The success of treatment using nucleic acid molecules in the invention may
be determined by the assessment of known clinical indicators (e.g., for a
neurodegenerative disease, loss of cognitive or motor function). The progression or
(if treatment is undertaken prophylactically on a patient believed to be at risk of
disease) development of such symptoms in a treated individual is compared to those
observed in untreated control subjects; if an improvement in the treated patient's
condition is observed relative to that of control subjects, treatment is judged to be
effective. The making of such an assessment is well within the knowledge of one of
skill in the art.



EXAMPLE 7 



Liposomal Delivery According to the Invention

Substances may be administered according to the invention using any
delivery means known in the art. Described below is liposomal delivery. Liposomes
which are used to administer the substances described herein, e.g., a ribozyme, can be
of various types and can have various compositions. The primary restrictions are that
the liposomes should not be toxic to the living cells and that they should deliver their
contents into the interior of the cells being treated.

The use of pH sensitive liposomes to mediate the cytoplasmic delivery of
calcein and FITC dextran has been described (see Straubinger et al., Cell 32:1069-1079,
1983; and Straubinger et al., FEBS Letters 179:148-154, 1985). Other
discussions of pH sensitive liposomes can be found in chapter 11 of the book CELL
FUSION, edited by A.E. Sowers, entitled "Fusion of Phospholipid Vesicles Induced
by Divalent Cations and Protons" by Nejat Duzgunes et al., Plenum Press, N.Y., 1987,
241-267. See also Ellens et al., Biochemistry, 23:1532-1538, 1984, and Bentz et al.,
Biochemistry 26:2105-2116, 1987.



The liposomes may be of various sizes and may have either one or several
membrane layers separating the internal and external compartments. The most
important elements in liposome structure are that a sufficient amount of enzyme or
nucleic acid be sequestered so that only one or a few liposomes are required to enter
each cell for delivery of the substance, and that the liposome be resistant to disruption.
Liposome structures include small unilamellar vesicles (SUVs, less than 250
angstroms in diameter), large unilamellar vesicles (LUVs, greater than 500 angstroms
in diameter), and multilamellar vesicles (MLs). In the example presented below,
although SUVs are used to administer a ribozyme, the methods are applicable to
administration of any substance described herein.

SUVs can be isolated from other liposomes and unincorporated enzyme by
molecular sieve chromatography, which is precise but time consuming and dilutes the
liposomes, or differential centrifugation, which is rapid but produces a wider range of
liposome sizes.

The liposomes may be made from natural and synthetic phospholipids,
glycolipids, and other lipids and lipid congeners; cholesterol, cholesterol derivatives
and other cholesterol congeners; charged species which impart a net charge to the
membrane; reactive species which can react after liposome formation to link additional
molecules to the liposome membrane; and other lipid soluble compounds which have
chemical or biological activity.

The liposomes useful according to the invention may be prepared, for
example, as described in U.S. Patent No. 5,296,231, which describes preparation of
liposomes containing a ribozyme, although it should be borne in mind that liposomes
useful according to the invention may contain any one of the substances as herein
described. Briefly, liposomes are prepared by combining a phospholipid component
with an aqueous component containing the ribozyme (or desired substance) under
conditions which will result in vesicle formation. The phospholipid concentration must
be sufficient to form lamellar structures, and the aqueous component must be compatible with
biological stability of the enzyme. Methods for combining the phospholipids and the
aqueous component so that vesicles will form include: drying the phospholipids onto glass and
then dispersing them in the aqueous component; injecting phospholipids dissolved in a
vaporizing or non-vaporizing organic solvent into the aqueous component which has
previously been heated; and dissolving phospholipids in the aqueous phase with
detergents and then removing the detergent by dialysis. The concentration of the
ribozyme in the aqueous component can be increased by lyophilizing the enzyme onto
dried phospholipids and then rehydrating the mixture with a reduced volume of
aqueous buffer. SUVs can be produced from the foregoing mixtures either by
sonication or by dispersing the mixture through either small bore tubing or through the
small orifice of a French Press.

Ribozymes incorporated into liposomes can be administered to living cells
internally or topically. Internal administration to animals or humans requires that the
liposomes be pyrogen-free and sterile. To eliminate pyrogens, pyrogen-free raw
materials, including all chemicals, enzymes, and water, are used to form the
liposomes. Sterilization can be performed by filtration of the liposomes through 0.2 
micron filters. For injection, the liposomes are suspended in a sterile, pyrogen-free 
buffer at a physiologically effective concentration. Topical administration also 
requires that the liposome preparation be pyrogen-free, and sterility is desirable. In 
this case, a physiologically effective concentration of liposomes can be suspended in a 
buffered polymeric glycol gel for even application to the skin. In general, the gel 
should not include non-ionic detergents which can disrupt liposome membranes. 
Other vehicles can also be used to topically administer the liposomes. 

The concentration of the substance in the final preparation can vary over a
wide range, a typical concentration being on the order of 50 µg/ml. In the case of pH
sensitive liposomes, lower concentrations of the substance can be used, e.g., on the
order of 0.01 to 1.0 µg/ml for liposomes administered to cells internally. In the case of
topical application, higher liposome concentrations are used, e.g., ten or more times
higher.

EXAMPLE 8 
Administration Across the Blood-Brain Barrier 

Where it is desired according to the invention to administer a substance as 
described herein or its coding sequence, or liposomes containing such substances, to 
an individual such that the administered material crosses the blood-brain barrier, 
several methods are known in the art. 

For example, a substance to be administered, whether it be protein or nucleic 
acid or liposome, may be co-administered with a polypeptide, for example a lipophilic 
polypeptide that increases permeability at the blood-brain barrier. Examples of such 
polypeptides include but are not limited to bradykinin and receptor mediated 
permeabilizers, such as A-7 or its conformational analogues, as described in U.S.
Patent Nos. 5,112,596 and 5,268,164. The permeabilizing polypeptide allows the
co-administered ribozyme, coding sequence or liposome to penetrate the blood-brain
barrier and arrive in the cerebrospinal fluid compartment of the brain, where the
ribozyme, or coding sequence may then reach and enter a target neuronal cell.
Alternatively, the substance to be administered may be coupled to a steroidal estrogen
or androgen to increase binding to steroid receptors and thus access to the brain.

Another exemplary method for administering a substance such as a ribozyme,
antibody, nucleic acids, or liposomes containing such molecules, according to the
invention includes forming a complex between the substance to be administered and
an antibody that is reactive with a transferrin receptor, as described in U.S. Patent No.
5,182,107. The complex may include a cleavable or non-cleavable linker and is
administered under conditions whereby binding of the antibody to a transferrin
receptor on a brain capillary endothelial cell occurs and the substance is transferred
across the blood-brain barrier in active form.

Ex vivo therapy 

It is possible to administer a therapeutic nucleic acid for use not only in in
vivo therapy (i.e., that in which a nucleic acid is administered directly to a patient for
uptake by, and subsequent expression in, cells in situ) but also in ex vivo therapy (i.e.,
that in which a nucleic acid is administered to cultured or explanted cells in vitro,
which transfected cells are subsequently transplanted into the clinical patient in order
to supply a therapeutic product). Methods of ex vivo gene therapy are described in
detail herein. By these methods, a plasmid which continues to be maintained in a
transformed or transfected cell after such a cell has been administered (e.g., via
transplantation) to a multicellular host, such as a mammal, delivers a gene product to
that individual. It is contemplated that a gene of interest, particularly a therapeutic
gene, will be expressed by the transplanted cell, thereby providing the recipient
organism, particularly a human, with a needed RNA (e.g., an antisense RNA or
ribozyme) or protein.

A cell type may be used according to the invention which is amenable to
methods of nucleic acid transfection such as are known in the art. Such cells may
include cells of an organism of the same species as the recipient organism, or even
cells harvested from the recipient organism itself for ex vivo nucleic acid transfection
prior to re-introduction. Such autologous cell transplants are known in the art. One
common example is that of bone marrow transplantation, in which bone marrow is
drawn either from a donor or from a clinical patient (for example, one who is about to
receive a cytotoxic treatment, such as high doses of ionizing radiation), and then
transplanted into the patient via injection, whereupon the cells re-colonize bones and
other organs of the hematopoietic system.
a. Cell dosage 

The number of transfected cells which are administered to a recipient 
organism is determined by dividing the absolute amount of therapeutic or other gene 
product required by the organism by the average amount of such an agent which is 
produced by a transfected cell. Note that steady-state plasmid copy number varies 
depending on the strength of its origin of replication as well as factors determined by 
the host cell, environment, the availability of nucleotides and replicative enzyme 
complexes, as does the level of expression of the gene of interest encompassed by the 
plasmid, which level likewise is determined by the strength of its associated promoter 
and the availability of nucleotides and transcription factors in a given host cell 
background. As a result, the level of expression per cell of a given gene of interest 
must be determined empirically prior to administration of cells to a recipient. 
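
The division described above can be sketched as a short calculation. The function name and the example figures below are hypothetical and serve only to illustrate the arithmetic; both quantities are assumed to be expressed in the same unit over the same time window.

```python
import math


def cells_to_administer(product_required: float, product_per_cell: float) -> int:
    """Return the number of transfected cells needed to supply a required amount.

    Both arguments must use the same unit and the same time window
    (for example, picograms per day).
    """
    if product_required < 0:
        raise ValueError("product_required must be non-negative")
    if product_per_cell <= 0:
        raise ValueError("product_per_cell must be positive")
    # Round up, since a fractional cell cannot be administered.
    return math.ceil(product_required / product_per_cell)


# Hypothetical figures, for illustration only.
required_pg_per_day = 5_000_000   # assumed requirement of the recipient (5 ug/day)
measured_pg_per_cell_per_day = 2  # assumed, empirically measured output per cell
print(cells_to_administer(required_pg_per_day, measured_pg_per_cell_per_day))
```

Because the per-cell output must be measured empirically, the second argument would in practice come from an expression assay on the transfected cell population rather than from a fixed constant.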

While efficient methods of cell transfection and transplantation are known in 
the art, they do not ensure that the transfected cell is immortal. In addition, the 
requirements of the recipient organism for the product encoded by the transgene may 
change over time. In light of these considerations, it is contemplated that cells may be 
administered in a single dose or in multiple doses, as needed. A multiple dose schedule 
is one in which a primary course of administration can include 1-10 separate doses, 
followed by other doses given at subsequent time intervals required to maintain and/or 
reinforce the cellular level of the transfected nucleic acid. Such intervals are 
dependent on the continued need of the recipient for the therapeutic gene product. 
Preferably, when the medical needs of the recipient mammal dictate that a gene 
product will be required throughout its lifetime, or at least over an extended period of 
time, such as a year or more, the transfected cells will be replenished on a regular 
schedule, such as monthly or semi-monthly, unless such cells are able to colonize the 
recipient patient in permanent fashion, such as is true in the case of a successful 
bone-marrow cell transplant. 

b. Nucleic acid dosage 

Provided a nucleic acid vector capable of replication in the transfected cell is 
used, the absolute amount of nucleic acid which is transfected into cells prior to 
transplantation is not critical, since in cells receiving at least one copy of such a vector, 
the vector will replicate until an equilibrium copy-number is achieved. As a first 
approximation, an amount of vector equivalent to between 1 and 10 copies thereof per 
cell to be transfected may be used; one of skill in the art may adjust the ratio of 
plasmid molecules to cells as is necessary to optimize vector uptake. Of particular 
use in the invention are vectors or transfection techniques which result in the stable 
integration of the gene of interest into the chromosome of the transfected cell, so as to 
avoid the need to maintain selection for cells bearing the vector following 
transplantation into a recipient multicellular organism, such as a human. 
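
As a rough illustration of the first approximation of 1 to 10 vector copies per cell, the sketch below converts a copy-per-cell ratio into a mass of double-stranded DNA. The average mass of about 650 g/mol per base pair is a commonly used approximation, and the vector length and cell count are hypothetical values chosen only for the example.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
AVG_BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one dsDNA base pair


def vector_mass_ng(copies_per_cell: float, n_cells: float, vector_length_bp: int) -> float:
    """Approximate mass (in nanograms) of vector DNA for a given copy:cell ratio."""
    if copies_per_cell <= 0 or n_cells <= 0 or vector_length_bp <= 0:
        raise ValueError("all arguments must be positive")
    molecules = copies_per_cell * n_cells
    grams = molecules * vector_length_bp * AVG_BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e9  # grams to nanograms


# Hypothetical example: 10 copies per cell, one million cells, 5,000 bp vector.
print(f"{vector_mass_ng(10, 1_000_000, 5_000):.4f} ng")
```

The result is a lower bound on the DNA actually needed, since the ratio of plasmid molecules to cells is typically raised to compensate for incomplete uptake.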

c. Administration of autologous or syngeneic cells 

A cell type which is commonly transplanted between individuals of a single 
species (or, even, from an individual to a cell culture system and back to the same 
individual) is that of hematopoietic stem cells (HSCs), which are found in bone 
marrow; such cells have the advantage that they are amenable to nucleic acid 
transfection while in culture, and are, therefore, well suited for use in the invention. 
Cultures of HSCs are transfected with a minimal plasmid comprising an operator 
sequence and a gene of interest and the transfected cells administered to a recipient 
mammal in need of the product of this gene. Transfection of hematopoietic stem cells 
is described in Mannion-Henderson et al., 1995, Exp. Hematol., 23: 1628; Schiffmann 
et al., 1995, Blood, 86: 1218; Williams, 1990, Bone Marrow Transplant., 5: 141; 
Boggs, 1990, Int. J. Cell Cloning, 8: 80; Martensson et al., 1987, Eur. J. Immunol., 
17: 1499; Okabe et al., 1992, Eur. J. Immunol., 22: 37-43; and Banerji et al., 1983, 
Cell, 33: 729. Such methods may advantageously be used according to the present 
invention. Administration of transfected cells proceeds according to methods 
established for that of non-transfected cells, as described below. 

The transplantation of hematopoietic cells, such as in a bone marrow 
transplant, is commonly performed in the art by procedures such as those described by 
Thomas et al. (1975, New England J. Med., 292: 832-843) and modifications thereof. 
Such a procedure is briefly summarized as follows. In the case of a syngeneic graft or of a 
patient suffering from an immunological deficiency, no immunosuppressive 
pre-treatment regimen is required; however, in cases in which cells of a non-self donor 
are to be administered to a patient with a responsive immune system, an 
immunosuppressive drug must be administered, e.g., cyclophosphamide (50 mg/kg 
body weight on each of four days, with the last dose followed 36 hours later by the 
transplant). Leukemic patients routinely receive a 1000-rad midline dose of total-body 
irradiation in order to ablate cancerous blood cells; this irradiation also has an 
immune-suppressive effect. Following pre-treatment, bone marrow cells (which 
population comprises a small number of pluripotent hematopoietic stem cells, or 
HSCs) are administered via injection, after which point they colonize the 
hematopoietic system of the recipient host. Success of the graft is measured by 
monitoring the re-appearance of the numerous adult blood cell types by the 
immunological and molecular methods which are well known in the art. While as few 
as 1-10 HSCs are, in theory, able to colonize and repopulate a lethally-irradiated 
recipient mammal over time, it is advantageous to optimize the rate at which 
repopulation occurs in a human bone marrow transplant patient; therefore, a 
transplanted bone marrow sample comprising 10 to 100, or even 100 to 1000 HSCs 
should be administered in order to be therapeutically effective. 

It is contemplated that both lymphoid and parenchymal cells are of use in the 
invention. Such parenchymal cells include those of the islets of Langerhans, the 
thyroid, the adrenal cortex, muscles, cartilaginous or other synovial tissue, the 
kidneys, epithelial tissues (both external and internal, particularly that of the intestinal 
lumen, lung, heart, liver, kidney, neurons and synovial cells) and, in particular, the 
nervous system. 

To render the transplanted cells resistant, at least collectively, to immune rejection 
by the recipient organism, it is contemplated that transplanted cells expressing a high 
level of activated NFkB (a high NFkB "set point"), while still subject to destruction by 
autoimmune host lymphocytes, would enjoy the advantage of robust proliferative 
capacity in order to multiply at a rate surpassing that of cell killing, thereby providing 
a long-lived population of therapeutic cells to the recipient organism. Such cells may 
be transfected with gene expression constructs which result in the production of high 
levels of activated NFkB, or may be cells obtained from a donor selected for high 
endogenous NFkB activity, as may be determined in an in vitro transcription assay or 
DNA/protein binding assay by methods well known in the art, using protein extracts 
drawn from such a donor, which may, itself, be a transgenic mammal. 

d. Administration of xenogeneic and allogeneic cells 

While transfection and subsequent transplantation of cells which are obtained 
from an individual or cell culture system of like species with the recipient organism 
may be performed, it is equally true that the invention may be practised using cells of 
another organism (such as a well-characterized eukaryotic microorganism, e.g. yeast, 
in which appropriate processing of proteins encoded by therapeutic genes is likely and 
in which useful origins of replication are known). In such a case, certain concerns 
must be addressed. 

First, when a protein is encoded by the gene of interest, the transplanted cells 
must produce the protein in a form that is of use to the recipient organism. 
Post-translational processing (including, but not limited to, cleavage and patterns of 
glycosylation) must be consistent with proper function in the recipient. In addition, 
either a protein or an RNA molecule of interest must be made available to the recipient 
after synthesis, such as by secretion, excretion or exocytosis from the transplanted cell. 
To address the former, the protein produced by the transfected cells may be 
qualitatively compared to the native protein produced by an individual of the same 
species as the recipient organism by biochemical methods well known in the art of 
protein chemistry. The latter, release of the protein of interest by the cells to be 
transplanted, may be assayed by isolating protein from culture medium which has been 
decanted from the transfected cells or from which such cells have been separated (i.e., 
by centrifugation or filtration), and performing Western analysis using an antibody 
directed at the protein of interest. Antibodies against many proteins are commercially 
available; techniques for the production of antibody molecules are well known in the 
art. 

Second, the cells must be shielded from immune rejection by the recipient 
organism. It is contemplated that such cells may be transfected with constructs 
expressing cell-surface markers (e.g., MHC antigens) characteristic of the recipient 
patient so as to provide them with biochemical camouflage. 

In addition, methods for the encapsulation of living cultures of cells for 
growth either in an artificial growth environment, such as in a fermentor, or in a 
recipient organism have been developed, and are also of use in the administration of 
cells transfected according to the invention. Such an encapsulation system renders the 
cell invisible to immune detection and, in addition, allows for the free exchange of 
materials (e.g., the gene product of interest, oxygen, nutrients and waste materials) 
between the transplanted cells and the environment of the host organism. 

Methods and devices for cell encapsulation are disclosed in numerous U.S. 
Patents; among these are Nos. 4,353,888; 4,409,311; 4,673,566; 4,744,933; 4,798,786; 
4,803,168; 4,892,538; 5,011,472; 5,158,881; 5,182,111; 5,283,187; 5,474,547; 
5,498,401 (which is particularly directed to the encapsulation of bacterial and yeast 
cells in chitosan); 5,550,050; 5,573,934; 5,578,314; 5,620,883; 5,626,561; 5,653,687; 
5,686,115; 5,693,513; and 5,698,413, the contents of which are fully incorporated by 
reference herein. Typically required for the successful culture of encapsulated cells is 
a selectively-permeable outer covering or 'skin' which is biocompatible (i.e., tolerated 
by both the encapsulated cells and the recipient host), and, optionally, a matrix in or 
upon which cells are distributed such that the matrix provides structural support and a 
substrate to which anchorage-dependent cells may attach themselves. As relates to 
encapsulation devices applicable to use in the invention, the term 
"selectively-permeable" refers to materials comprising openings through which small molecules 
(including molecules of up to about 50,000 M.W. - 100,000 M.W.) may pass, but from 
which larger molecules, such as antibodies (approximately 150,000 M.W.), are 
excluded. Suitable covering materials include, but are not limited to, porous and/or 
polymeric materials such as polyaspartate, polyglutamate, polyacrylates (e.g., acrylic 
copolymers or RL®, Monsanto Corporation), polyvinylidene fluoride, 
polyvinylidenes, polyvinyl chloride, polyurethanes, polyurethane isocyanates, 
polystyrenes, polyamides, cellulose-based polymers (e.g., cellulose acetates and 
cellulose nitrates), polymethyl-acrylate, polyalginate, polysulfones, polyvinyl alcohols, 
polyethylene oxide, polyacrylonitriles and derivatives, copolymers and/or mixtures 
thereof, stretched polytetrafluoroethylene (U.S. Pat. Nos. 3,953,566 and 4,187,390, 
both incorporated herein by reference), stretched polypropylene, stretched 
polyethylene, porous polyvinylidene fluoride, woven or non-woven collections of 
fibers or yarns, such as "Angel Hair" (Anderson, Science, 246: 747-749; Thompson et 
al., 1989, Proc. Natl. Acad. Sci. U.S.A., 86: 7928-7932), fibrous matrices (see U.S. 
Pat. No. 5,387,237, incorporated herein by reference), either alone or in combination, 
or silicon-oxygen-silicon matrices (U.S. Patent No. 5,693,513). Polylysine having a 
molecular weight of 10,000 to 30,000, preferably 15,000 to 25,000 and most 
preferably 17,000 is also of use in the invention (see U.S. Patent No. 4,673,566). 
Alternatively, the matrix material, comprising the transfected cells of the invention, is 
exposed to conditions that induce it to form its own outer covering, as discussed 
below. 
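
The size-exclusion criterion given for a selectively-permeable covering can be expressed as a simple threshold test. This is a deliberately simplified sketch: real membrane permeability also depends on molecular shape, charge and pore-size distribution, and the function name and the example molecules (with approximate molecular weights) are illustrative assumptions rather than part of the disclosure.

```python
def passes_covering(molecular_weight: float, cutoff: float = 100_000.0) -> bool:
    """Return True if a molecule of the given M.W. is at or below the cutoff.

    The text places the cutoff at about 50,000 to 100,000 M.W.; the default
    uses the upper end of that range.
    """
    if molecular_weight <= 0 or cutoff <= 0:
        raise ValueError("molecular weight and cutoff must be positive")
    return molecular_weight <= cutoff


# Approximate molecular weights, for illustration only.
examples = {
    "glucose": 180,
    "insulin": 5_800,
    "antibody (IgG)": 150_000,
}
for name, mw in examples.items():
    status = "passes" if passes_covering(mw) else "excluded"
    print(f"{name}: {status}")
```

With either end of the stated cutoff range, nutrients and small protein products pass while an antibody of approximately 150,000 M.W. is excluded, which is the behavior the covering is intended to provide.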

As described in U.S. Patent No. 5,626,561, the selective permeability of such 
a covering may be varied by impregnating the void spaces of a porous polymeric 
material (e.g., stretched polytetrafluoroethylene) with a hydrogel material. Hydrogel 
material can be impregnated in substantially all of the void spaces of a porous 
polymeric material or in only a portion of the void spaces. For example, by 
impregnating a porous polymeric material with a hydrogel material in a continuous 
band within the material adjacent to and/or along the interior surface of a porous 
polymeric material, the selective permeability of the material is varied sharply from an 
outer cross-sectional area of the material to an inner cross-sectional area of the 
material. The amount and composition of hydrogel material impregnated in a porous 
polymeric material depends in large part on the particular porous polymeric material 
used to encapsulate cells for transplant. Examples of suitable hydrogel materials 
include, but are not limited to, HYPAN® Structural Hydrogel (Hymedix International, 
Inc.; Dayton, NJ), non-fibrogenic alginate, as taught by Dorian in PCT/US93/05461, 
which is incorporated herein by reference, agarose, alginic acid, carrageenan, collagen, 
gelatin, polyvinyl alcohol, poly(2-hydroxyethyl methacrylate), 
poly(N-vinyl-2-pyrrolidone) or gellan gum, either alone or in combination. 

The matrix typically has a high surface-area:volume ratio, comprising pores or other 
spaces in or on which cells may grow and through which fluids may pass; in addition, 
suitable matrix materials are stable following transplantation into a recipient organism. 
Preferably, the matrix comprises an aggregation of multiple particles, fibers or 
laminae. Alternatively, a matrix may comprise an aqueous solution, such as a 
physiological buffer or body fluid from the recipient organism (see U.S. Patent No. 
5,011,472). Suitable matrix materials include liquid, gelled, polymeric, co-polymeric 
or particulate formulations of aminated glucopolysaccharides (e.g., deacetylated chitin, 
or "chitosan", which is prepared from the pulverized shells of crabs or other 
crustaceans, and is commercially available as a dry powder; Cat. # C 3646, Sigma, St. 
Louis, MO), alginate (U.S. Patent No. 4,409,331), poly-β-1→4-N-acetylglucosamine 
(p-GlcNAc) polysaccharide species (either alone or formulated as co-polymer with 
collagen; see U.S. Patent No. 5,686,115), reconstituted extracellular matrix 
preparations (e.g., Matrigel®; Collaborative Research, Inc, Lexington, MA; Babensee 
et al., 1992, J. Biomed. Mater. Res., 26: 1401), proteins, polyacrylamide, agarose and 
others. 

Methods by which cells become encapsulated using such materials are both 
numerous and varied. Encapsulation devices comprising a semi-permeable membrane 
material, as described above, may be pre-formed, filled with cells (e.g., by injection or 
other manual means) and then sealed (U.S. Patent Nos. 4,892,538; 5,011,472; 
5,626,561; and 5,653,687); such sealing may be effectively permanent (e.g., by the use 
of heat-sealing), semi-permanent (e.g., by the use of a biocompatible adhesive, such as 
an epoxy, which will not dissolve or degrade in an aqueous environment) or temporary 
(e.g., by the use of a removable cap or plug, or by shutting of a valve or stopcock). 
Methods of permanent and semi-permanent sealing are disclosed in U.S. Patent No. 
5,653,687. As an alternative to the use of a pre-formed, semi-permeable cell reservoir, 
cells suspended in matrix material and the substance which is to 
form the outer covering of the encapsulation device may be co-extruded under conditions 
which cause the cell/matrix mixture, which may be in liquid or semi-liquid (i.e., 
gelled) form, to be encased in a continuous tube of the semi-permeable polymer, which 
either forms, or becomes crosslinked, under the extrusion conditions; such an 
extrusion procedure may lead to the formation of capsules which have only one cell 
reservoir (U.S. Patent No. 5,283,187) or which are divided into multiple, discrete 
compartments (U.S. Patent No. 5,158,881). As an alternative to both types of 
procedure, a liquid or semi-liquid (i.e., gelled) cell/matrix mixture droplet is suspended 
either in an agent which induces 'curing' or crosslinking of the outer layer of matrix 
material to form a semi-permeable barrier (U.S. Patent Nos. 4,798,786 and 5,489,401) 
or in a solution of polymeric material (or monomers thereof), which will polymerize 
and/or crosslink upon contact with the cell/matrix droplet such that a semi-permeable 
membrane is deposited thereon (U.S. Patent Nos. 4,353,888; 4,673,566; 4,744,933; 
5,620,883; and 5,693,513). 

One of skill in the art is well able to select the appropriate matrix and semi- 
permeable membrane materials and to construct a cell-encapsulation device as 
described above. 

Implantation of such a device is achieved surgically, via standard techniques, 
to a site at or near the anatomical location to which the product encoded by the gene 
of interest is to be delivered, as is deemed safest and most expedient. 
Such a device may take a convenient shape, including, but not limited to, that of a 
sphere, pellet or other capsule shape, disk, rod or tube; often, the shape of the device is 
determined by its method of synthesis. For example, one which is formed by 
co-extrusion of a cell suspension and a polymeric covering material is typically tubular, 
while one formed by the deposition of a covering on droplets comprising cells in 
matrix material might be spherical. As discussed above, the number of cells which 
must be implanted (and, therefore, encapsulated) is dependent upon the requirements 
of the recipient organism for the product of the transfected gene. The encapsulation 
devices described above are typically small (most usefully, 10 µm to 1 mm in diameter, 
so as to permit efficient diffusion of substances back and forth between the outer 
covering and the cells most deeply embedded in the matrix), and it is contemplated 
that such devices may carry between 10 and 10^10 cells each. Should the need for larger 
numbers of cells be anticipated, a plurality (2, 10 or even 100 or more) of such in vivo 
culturing devices may be made and implanted in a given recipient organism. 
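
The relationship between the required number of cells and the number of devices to implant is a ceiling division, sketched below. The per-device capacity is taken as 10 to 10^10 cells, following the range contemplated in the text; the function name and the example figures are hypothetical.

```python
MIN_CELLS_PER_DEVICE = 10
MAX_CELLS_PER_DEVICE = 10**10


def devices_needed(cells_required: int, cells_per_device: int) -> int:
    """Return how many encapsulation devices are needed to carry the cells."""
    if cells_required < 0:
        raise ValueError("cells_required must be non-negative")
    if not MIN_CELLS_PER_DEVICE <= cells_per_device <= MAX_CELLS_PER_DEVICE:
        raise ValueError("cells_per_device is outside the contemplated range")
    # Integer ceiling division avoids floating-point rounding for large counts.
    return -(-cells_required // cells_per_device)


# Hypothetical example: 2.5 million cells, devices holding 100,000 cells each.
print(devices_needed(2_500_000, 100_000))
```

In this sketch a requirement that exceeds the capacity of a single device simply yields a plurality of devices, in line with the 2, 10 or 100 or more devices mentioned above.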

An encapsulated cell device may be intended for permanent installation; 
alternatively, retrieval of the device may be desirable, whether to terminate delivery of 
the product of the gene of interest to the recipient organism at the discretion of one of 
skill in the art, such as a physician (who must determine on a case-by-case basis the 
length of time for which a given cell implant is beneficial to the recipient organism) or 
to replenish the device with fresh cells after long-term use (i.e., months to years). To 
the latter end, an implantation device may usefully comprise a retrieval aid, such as a 
guidewire, and a cap or other port, such as may be opened and re-sealed in order to 
gain access to the cell reservoir, both as described in U.S. Patent No. 4,892,538. 

Live cultures of encapsulated cells have been used successfully to deliver 
gene products to tissues of a recipient animal. U.S. Patent No. 4,673,566 discloses 
successful maintenance of normal blood sugar levels in a diabetic rat into which 
encapsulated rat islet of Langerhans cells were implanted; two administrations of 
3,000 cells each together were effective for six months, while a single dose of 1,000 
cells was effective for two months. 

Similarly, heterospecific transplantation of encapsulated islet cells has been 
demonstrated to treat diabetes successfully (dog islet cells to a mouse recipient, U.S. 
Patent No. 5,578,314; porcine islet cells to a mouse recipient, Sun et al., 1992, ASAIO 
J., 38: 124). It is believed that such an approach is promising for the clinical 
treatment of diabetes mellitus in humans (Calafiore, 1992, ASAIO J., 38: 34). 

It is contemplated that these techniques, which have been applied 
successfully to untransfected cells, may be utilized advantageously with cells that are 
transfected with therapeutic nucleic acid molecules of use in the invention. 

e. Assay of efficacy of transplanted cells in a recipient organism 

The efficacy of the transfected cells so administered and their subsequent 
maintenance in the recipient host may be assayed either by monitoring the activity of a 
marker gene, which may additionally be comprised by the transfected construct, or by 
the direct measurement of either the product (e.g., a protein) encoded by the gene of 
interest or the reduction in the levels of a protein the production of which it (an 
antisense message or ribozyme) is designed to inhibit. The assays can be performed 
using conventional molecular and biochemical techniques, such as are known to one 
skilled in the art, or may comprise histological sampling (i.e., biopsy) and examination 
of transplanted cells or organs. 

In addition to direct measurements of protein or nucleic acid levels in blood 
or target tissues encoded by the gene of interest borne by the vector in 
transfected/transplanted cells, it is possible to monitor changes in the disease state in 
patients receiving gene transfer via transplantation of cells in which the gene of 
interest is maintained and compare them to the progression or persistence of disease in 
patients receiving comparable cells transfected with vector constructs lacking the gene 
of interest. 

Other Dosages and Modes of Administration 

A patient that is subject to a disease state which is associated with a 
frameshift mutation may be treated in accordance with the invention, as described 
above, via in vivo, ex vivo or in vitro methods. For example, in in vivo treatments, a 
ribozyme or a nucleic acid vector encoding a ribozyme, a wild-type RNA, an antisense 
RNA or DNA or a sequence encoding the antisense RNA, or a wild-type version of a 
hybrid wild-type/nonsense protein, can be administered to the patient, preferably in a 
pharmaceutically acceptable delivery vehicle and a biologically compatible solution, 
by ingestion, injection, inhalation or any number of other methods. The dosages 
administered will vary from patient to patient; an "effective dose" will be determined 
by the level of enhancement of function of the transferred genetic material balanced 
against any risk of deleterious side effects. Monitoring gene expression and/or the 
presence or levels of the encoded mutant RNA or protein or its corresponding "sense" 
protein will assist in selecting and adjusting the dosages administered. Generally, a 
composition including a nucleoprotein such as a ribozyme will be administered in 
single or multiple doses, as determined by the physician, in the range of 50 µg - 1 mg, 
or within the range of 100 µg - 500 µg. A composition including an oligonucleotide 
will be administered in a single dose in the range of 5 ng - 10 µg, or within the range 
of 100 ng - 500 ng. A composition including a wild-type RNA or a vector will be 
administered in a single dose in the range of 10 ng - 100 µg/kg body weight, preferably 
in the range of 100 ng - 10 µg/kg body weight, such that at least one copy of the 
sequence is delivered to each target cell. A composition including a protein, e.g., a 
wild-type version of a hybrid wild-type/nonsense protein, will be administered in 
single or multiple doses, as determined by the physician, in the range of 10 µg - 1 mg, 
or within the range of 100 µg - 50 µg. Any of the above dosages may be administered 
according to the body weight of the patient, as determined by the physician. 
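
For the ranges stated per kilogram of body weight (a wild-type RNA or a vector), the absolute range for a given patient follows by multiplication, as in the sketch below. The range constants restate the figures given above in nanograms per kilogram; the function name and the 70 kg body weight are hypothetical and serve only to show the arithmetic.

```python
# Ranges from the text for a wild-type RNA or vector, in ng per kg body weight.
BROAD_RANGE_NG_PER_KG = (10.0, 100_000.0)      # 10 ng - 100 ug per kg
PREFERRED_RANGE_NG_PER_KG = (100.0, 10_000.0)  # 100 ng - 10 ug per kg


def dose_range_ng(body_weight_kg: float, range_ng_per_kg: tuple) -> tuple:
    """Scale a per-kilogram dose range to an absolute range in nanograms."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ng_per_kg
    return (body_weight_kg * low, body_weight_kg * high)


# Hypothetical 70 kg patient, for illustration only.
print(dose_range_ng(70.0, BROAD_RANGE_NG_PER_KG))
print(dose_range_ng(70.0, PREFERRED_RANGE_NG_PER_KG))
```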

Ex vivo transduction is also contemplated within the present invention. Cell 
populations can be removed from the patient or otherwise provided, transduced with a 
vector in accordance with the invention, then reintroduced into the patient. The 
number of cells reintroduced into the patient will depend upon the efficiency of vector 
transfer, and will generally be in the range of 10^4 - 10^6 transduced cells/patient. 
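
The dependence on the efficiency of vector transfer can be illustrated with a short calculation of how many cells would have to be exposed to the vector in order to obtain a target number of transduced cells. The function name and the example efficiency of 25% are hypothetical; the targets are taken from the range of 10^4 to 10^6 transduced cells per patient.

```python
import math


def cells_to_expose(target_transduced_cells: int, transfer_efficiency: float) -> int:
    """Return the number of cells to expose to the vector.

    transfer_efficiency is the fraction (0 < e <= 1) of exposed cells
    that end up carrying the vector.
    """
    if target_transduced_cells < 0:
        raise ValueError("target must be non-negative")
    if not 0.0 < transfer_efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return math.ceil(target_transduced_cells / transfer_efficiency)


# Hypothetical example: 10^6 transduced cells wanted at 25% transfer efficiency.
print(cells_to_expose(10**6, 0.25))
```

A lower transfer efficiency therefore raises the number of cells that must be harvested or cultured before transduction, even though the number reintroduced stays within the stated range.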

The cells targeted for ex vivo gene transfer in accordance with the invention 
include any cells to which the delivery of the vector is desired, for example, neuronal 
cells or stem cells. 

Protein, nucleic acid, or cells administered according to the invention are 
preferably administered in admixture with a pharmaceutically acceptable carrier 
substance, e.g., magnesium carbonate, lactose, or a phospholipid to form a micelle; the 
carrier and protein, nucleic acid or cell together can form a therapeutic composition, 
e.g., a pill, tablet, capsule or liquid for oral administration to the mammal. Other 
forms of compositions are also envisioned, e.g., a liquid capable of being administered 
nasally as drops or spray, or a liquid capable of intravenous, parenteral, subcutaneous, 
or intraperitoneal administration. The substance administered may be in the form of a 
biodegradable sustained release formulation for intramuscular administration. For 
maximum efficacy, where zero-order release is desirable, an implantable or 
external pump, e.g., an Infusaid™ pump (Infusaid Corp, MA), may be used. 

Kits Useful According to the Invention 

The invention encompasses kits for diagnosis or treatment of diseases 
according to the invention. 

A diagnostic kit includes suitable packaging materials and one or more of the 
following reagents: a nucleic acid probe as defined hereinabove, and optionally means 
for detecting the probe when bound to its complementary sequences. For example, the 
nucleic acid probe may be labeled, e.g., radiolabeled, fluorescently labeled, etc., or 
may be detected via indirect labeling techniques, e.g., using a biotin/avidin system, 
well known in the art. 

A diagnostic system, preferably in kit form, comprises yet another 
embodiment of this invention. This system is useful for assaying the presence of a 
hybrid wild-type/nonsense protein or its derivative in cells by the formation of an 
immune complex. This system includes at least one package that contains an antibody 
of this invention. Optionally, a kit also may include a positive tissue sample control. 

Antibodies are also utilized along with an "indicating group", also sometimes 
referred to as a "label". The indicating group or label is utilized in conjunction with 
the antibody as a means for determining whether an immune reaction has taken place, 
and in some instances for determining the extent of such a reaction. 

The terms "indicating group" or "label" are used herein to include single 
atoms and molecules that are linked to the antibody or used separately, and whether 
those atoms or molecules are used alone or in conjunction with additional reagents. 
Such indicating groups or labels are themselves well-known in immunochemistry and 
constitute a part of this invention only insofar as they are utilized with otherwise novel 
antibodies, methods and/or systems. 

For example, an antigen-specific antibody or antibody fragment is detectably 
labeled by linking the same to an enzyme and using it in an enzyme immunoassay 
(EIA) or enzyme-linked immunosorbent assay (ELISA). This enzyme, in turn, when 
later exposed to a substrate, reacts with the substrate in such a manner as to produce a 
chemical moiety which can be detected, for example, by spectrophotometric, 
fluorometric or, most preferably, visual means. 
The substrate may be a chromogenic substrate which generates a reaction product 
visible to the naked eye. 

Enzymes which can be used to detectably label the binding protein which is 
specific for the desired detectable mutant protein include, but are not limited to, 
alkaline phosphatase, horseradish peroxidase, glucose-6-phosphate dehydrogenase, 
staphylococcal nuclease, delta-V-steroid isomerase, yeast alcohol dehydrogenase, 
alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, asparaginase, 
ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and 
acetylcholinesterase. 

By radioactively labeling the binding protein, for example, the antibody, it is 
possible to detect the antigen bound to a solid support through the use of a 
radioimmunoassay (RIA). The radioactive isotope can be detected by such means as 
the use of a gamma counter or a scintillation counter or by autoradiography. Isotopes 
which are particularly useful for the purpose of the present invention are: 3H, 131I, 14C 
and, preferably, 125I. 

It is also possible to label the first or second binding protein with a 
fluorescent compound. When the fluorescently labeled antibody is exposed to light of 
the proper wave length, its presence can then be detected due to fluorescence. Among 
the most commonly used fluorescent labelling compounds are fluorescein 
isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, 
o-phthaldehyde and fluorescamine. 

The first or second binding protein also can be detectably labeled by coupling 
it to a chemiluminescent compound. The presence of the chemiluminescent-tagged 
antibody is then determined by detecting the presence of luminescence that arises 
during the course of a chemical reaction. Examples of particularly useful 
chemiluminescent labeling compounds are luminol, isoluminol, theromatic acridinium 
ester, imidazole, acridinium salt and oxalate ester. 

Likewise, a bioluminescent compound may be used to label the first or 
second binding protein. Bioluminescence is a type of chemiluminescence found in 
biological systems in which a catalytic protein increases the efficiency of the 
chemiluminescent reaction. The presence of a bioluminescent protein is determined 
by detecting the presence of luminescence. Important bioluminescent compounds for 
purposes of labeling are luciferin, luciferase and aequorin. 

The invention also includes diagnostic reagents for use in the present 
invention, such as nucleic acid sequences, probes and antibody molecules, and/or 
positive tissue controls, as described above, and kits including such reagents for use in 
diagnosing or treating a disease. 

An indicating group or label is preferably supplied along with the antibody 
and may be packaged therewith or packaged separately. Additional reagents such as 
hydrogen peroxide, diaminobenzidine and nickel ammonium sulfate may also be 
included in the system when an indicating group such as HRP is utilized. Such 
materials are readily available in commerce, as are many indicating groups, and need 
not be supplied along with the diagnostic system. In addition, some reagents such as 
hydrogen peroxide decompose on standing, or are otherwise short-lived like some 
radioactive elements, and are better supplied by the end-user. 



OTHER EMBODIMENTS 

It will be understood that the invention is described by way of illustration 
only. Many other embodiments of the present invention in addition to those herein 
described will be apparent to those skilled in the art from the description herein given 
without departing from the scope of the present invention as defined in the appended 
claims. 
Table 1 

EARLY ONSET (10-20% of total number of AD cases) 
    60% familial 
        90% unknown (54) 
        10% autosomal dominant (6) 
            33% PS1 
            5% APP 
            <1% PS2 
            60% unknown 
    40% sporadic 

LATE ONSET (80-90% of total number of AD cases) 
    40% familial 
        90% unknown (36) 
        10% autosomal dominant* (4) 
    60% sporadic 


Based upon a population-based cross-sectional study of dementia, 72% of all 
demented people suffer from Alzheimer's disease (AD), 16% from vascular dementia, 
6% from Parkinson's disease dementia and 6% from other dementias (Ott et al., 1995). 
Early (<65 years) and late (>65 years) onset (EOAD and LOAD) forms of Alzheimer's 
disease are distinguished. Familial means that AD was observed in relatives of the 
first degree. This study is based upon Ott et al., 1995; Van Broeckhoven, 1995; Cruts 
et al., 1998. 

In familial EOAD the majority (54%) is not yet linked to a chromosome, whereas 6% 
is inherited in an autosomal dominant way and linked to chromosome 1 (PS2, 
<1%), 14 (PS1, 33%) or 21 (APP, 5%); 60% of the autosomal dominant forms 
are still not linked. In familial LOAD, the majority (90%) is not yet linked to a 
chromosome, whereas 10% is inherited in an autosomal dominant way. A subset 
may be linked to chromosome 12 (Pericak-Vance et al., 1997) and ApoE4 nuclear 
families. 
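
The figures given in parentheses in Table 1 (54, 6, 36 and 4) appear to be the product of the nested percentages, that is, the share of all cases within an onset group. A minimal sketch of that arithmetic, assuming the rounded percentages quoted in the table; the function name is illustrative and not part of the source:

```python
def share_of_onset_group(familial_pct: int, subgroup_pct: int) -> float:
    """Percentage of all cases in an onset group that fall in one familial subgroup."""
    return familial_pct * subgroup_pct / 100


# Early onset: 60% of cases are familial; 90% of those are unlinked, 10% autosomal dominant.
eoad_unlinked = share_of_onset_group(60, 90)
eoad_dominant = share_of_onset_group(60, 10)

# Late onset: 40% of cases are familial, with the same 90% / 10% split.
load_unlinked = share_of_onset_group(40, 90)
load_dominant = share_of_onset_group(40, 10)

print(eoad_unlinked, eoad_dominant, load_unlinked, load_dominant)
```
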

Risk factors: 65% of all EOAD and 25% of all LOAD cases display ApoE4 
polymorphism (one or two E4 alleles). ApoE4 data in early onset AD are based upon 
a study by Van Broeckhoven and Cruts (n = 102 patients). Other risk factors for late 
onset AD are butyrylcholinesterase and cytochrome c oxidase. 
Table 2 

Clinico-pathological information of controls and AD patients. 

NBB/autopsy no.   Age (years)   Sex (m/f)   Dementia duration (years)   GDS   Postmortem delay (h)   Fixation duration (days)   Brain weight (g)   Cause of death 



Non-demented controls 



89003 



81021 
94119 
94125 

88037 

90073 
90079 

91026 
91027 

90080 



81007 
90083 



34 



43 
51 
51 

58 

65 
72 

80 
82 

85 



90 
90 



m 



m 
f 

m 
m 
f 



f 
f 

m 



<17 



23 
8 
6 

41 

41 
4 

36 
<55 



12 

5 



1124 



53 
41 
47 

1088 

403 
126 

65 
38 

126 



48 
143 



1348 



1260 
1156 
1518 

1797 

1234 
1330 

1205 
1100 

1050 



1110 
1040 



empyema of pleura, 
fibrous pleuritis and 
fibrous pericarditis, 
AIDS 

non-Hodgkin lymphoma 
sepsis 

progressive liposarcoma 
and ileus 
lung carcinoma, 
massive hemorrhage 
pulmonary embolism 
myocardial infarction, 
cardiogenic shock 
cardiogenic shock 
myocardial infarction, 
ventricular fibrillation 
cardiac failure, myocardial infarction, 
coronary sclerosis, lung 
emphysema 
postoperative infections 
metabolic acidosis 



Alzheimer cases 



89057 
86055 
90102 
91092 
85013 

92054 
88073 



40 
45 
49 
54 
56 

61 
66 

70 



83002 
cachexia 
93047 70 



91094 

90118 
93044 
93087 
90015 
93045 

91081 

88028 
90117 
91086 
86002 
93048 



73 

77 
77 
81 
81 
83 

85 

85 
86 
88 
90 
92 



m 

m 

m 

f 

f 

m 
m 



m 
f 

m 

m 

m 

f 

f 



f 

m 
m 
f 
f 



5-6 
11 
6 
5 

4-5 

3 
±15 

12 

12 

11 

7 

>5 
6 
6 
14 



5 
10 
4 

>8 
3 



Down's syndrome 
93162 54 f* 
92080 58 f* 
89005 59 f* 
93161 62 f* 
96015 63 f* 



11 
8 



5 
11 



7 

>4 
7 
6 
7 

6 
7 

7 

7 

7 

7 

na 
6 
6 
7 



4 

7 
5 

>5 
7 



3 
4 
4 
3 

22 

6 
3 

13 



4 
4 
4 
3 
5 



28 1410 AD, cachexia 
3640 1130 AD, cachexia 
33 1426 AD, epilepsy 
78 1055 AD, cachexia 
48 1180 AD, bronchopneumonia, dehydration 
30 1180 AD, renal insufficiency 
30 1270 AD, ischemic cerebral stroke, cachexia, sepsis 
34 760 AD, status epilepticus, 
125 1325 AD, urinary tract infection 
66 1106 AD, dehydration, circulation failure 
75 1168 AD, pneumonia 
127 1095 AD, bronchial pneumonia 
66 1088 AD, bacterial infection 
28 1020 decompensatio cordis 
127 1005 AD, cachexia, urinary tract infection 
39 1060 AD, metastasis digestive tract 
180 1020 AD, hypovolemic shock 
77 1303 AD, uraemia 
75 1058 AD, decompensatio cordis 
38 1060 AD, dehydration 
124 896 AD, cachexia, uraemia 



na 
7 



7 

na 



<17 
10 



5 
17 
24 

71 



614 
140 



29 
585 
87 



730 DS, bronchopneumonia 
712 DS, epilepsia, pneumonia, decompensatio cordis 
812 DS, cardiac arrest 
1100 DS, pneumonia 
980 DS, cardiac-respiratory insufficiency 
94058 64 m* 3 7 7 47 875 DS, dehydration, pneumonia 
93028 67 f* 8 7 11 104 859 DS, pneumonia 



NBB = Netherlands Brain Bank, * = karyotype 47XX21/47XY21, na = not available, GDS = Global deterioration scale (B. 
Reisberg, F.H. Ferris, M.J. DeLeon and T. Crook, Am. J. Psych. 139, 1136, 1982). 'non-demented 






WO 98/45322 



PCT/IB98/00705 



Table 3 Immunoreactivities in the human frontal cortex (Brodmann area 11) for mutant β amyloid precursor protein and 
ubiquitin-B, the mRNA of which is expressed in the +1 reading frame (βAPP+1 and Ubi-B+1). Tissues were obtained from 
controls and neuropathologically confirmed Alzheimer and Down syndrome cases. 

NBB autopsy no.   age (years)   sex (m/f)   neuropathological state* (plaques, tangles)   βAPP+1   Ubi-B+1 



Non-demented controls 



89003 


34 


m 


- 


- 


* 


81021 


43 


m 








94119 


51 


f 


- 


- 


- 


94125 


51 


m 








88037 


58 


m 


- 


- 


- 


90073 


65 


f 


- 


- 


- 


90079 


72 


m 




- 


- 


91026 


80 


f 


- 


- 


- 


91027 


82 


f 


- 


- 


- 


90080 


85 


m 




+* 


- 


81007 


90 


f 


- 


+* 


- 


90083 


90 


f 




- 


- * 


% pos. staining 








0% 0% 


Alzheimer cases 










89057 


40 


m 


+ b 


+ c 


+ 


86055 


45. 


m 


+ b 


+ c 


+ 


90102 


49 


m 


+ c 


+ c 


+ 


91092 


54 


f 


+ c 


+ c 


+ 


85013 


56 


f 


+ b 


+ b 




92054 


61 


m 


+ c 


♦ c 


+ 


88073 


66 


m 


+ c 


+ c 


+ 


83002 


70 


f 


+ c 


+ c 




93047 


70 


m 


+ c 


+ c 


+ 


91094 


73 


f 


+c 


+ c 


+ 


90118 


77 


m 




+ c 


+ 


93044 


77 


m 




+» 




93087 


81 


m 


+ 8 


+ D 


+ 


90015 


81 


f 


+ b 


+* 


+ + 


93045 


83 


f 


+» 


+ c 


+ 


91081 


85 


f 


+ J 


+» 




88028 


85 


f 


+ b 




+ 


90117 


86 


. m 


+ b 




+ 


91086 


88 


* m 


+» 


+• 


+ 


86002 


90 


f 


+ c 


+» 


+ + 


93048 


92 


f 








% pos. staining 








19% 80% 


Downs' syndrome 










93162 


54 


f 


+ c 


+ c 


+ + 


92080 


58 


f 


+ b 


+« 




89005 


59 


f 


+<> 


+ 6 




93161 


62 


f 


+ c 






96015 


63 


f 








94058 


64 


m 


+*> 




+ + 


93028 


67 


f 






+ + 


% pos. staining 








86% 86% 



silver staining: a) few, b) moderate, c) many. 
Table 4 Immunoreactivities in the human temporal cortex (Brodmann area 38) for mutant β amyloid precursor protein and 
ubiquitin-B, the mRNA of which is expressed in the +1 reading frame (βAPP+1 and Ubi-B+1). Tissues were obtained from 
controls and neuropathologically confirmed Alzheimer cases. Down syndrome patients showed Alzheimer pathology. 

NBB autopsy no.   age (years)   sex (m/f)   neuropathological state* (plaques, tangles)   βAPP+1   Ubi-B+1 


Non-demented controls 












89003 


34 


m 










81021 


43 


m 










94119 


51 


f 










94125 


51 


m 










88037 


58 


m 










90073 


65 


f 










90079 


72 


m 










91026 


80 


f 










91027 


82 


f 










90080 


85 


m 


+ b 


+ b 




+ 


81007 


90 


f 


+» 








90083 


90 


f 


+*> 








% pos. staining 

Alzheimer cases 








0% 


8% 



89057 
86055 
90102 
91092 
85013 
92054 
88073 
83002 
93047 
91094 
90118 
93044 
93087 
90015 
93045 
88028 
91081 
90117 
91086 
86002 
93048 



40 
45 
49 
54 
56 
61 
66 
70 
70 
73 
77 
77 
81 
81 
83 
85 
85 
86 
88 
90 
92 



% pos. staining 



m 

m 

m 

f 

f 

m 
m 
f 

m 
f 

m 

m 

m 

f 

f 

f 

f 

.m 
-m 

f 

f 



+» 

+ c 
+* 
+ a 
+ c 

+» 
+» 
+ c 

+» 
+ c 



+ c 
+ c 

+» 



+ 



+ 
+ 
+ 
+ 
+ 
+ 
+ 



+ 
+ 
+ 
+ 
+ 
+ 
+ 

+ 



43% 



95% 



Down's syndrome 
93162 
92080 
89005 
93161 
96015 
94058 
93028 

% pos. staining 



54 


f 


♦ c 


58 


f 


+* 


59 


f 


+ b 


62 


f 


+ e 


63 


f 


+• 


64 


m 




67 


f 





+ 
+ 



+ 



+ 
+ 



86% 



86% 
Table 5 Immunoreactivities in the human hippocampus for mutant β amyloid precursor protein and ubiquitin-B, the mRNA 
of which is expressed in the +1 reading frame (βAPP+1 and Ubi-B+1). Tissues were obtained from controls and 
neuropathologically confirmed Alzheimer and Down syndrome cases. 

NBB autopsy no.   age (years)   sex (m/f)   neuropathological state* (plaques, tangles)   βAPP+1   Ubi-B+1 


Non-demented controls 












89003 


34 


m 










81021 


43 


m 










94119 


51 


f 










94125 


51 


m 










88037 


58 


m 




+* 






90073 


65 


f 










90079 


72 


m 


~+ b 


+ c 




+ 


91026 


80 


f 


+ b 


+ c 




+ 


91027 


82 


f 




+ c 




+ 


90080 


85 


m 


+ b 


+ b 




+ 


81007 


90 


f 


+ b 


+ b 




+ 


90083 


90 


f 




+■ 




+ 


% pos. staining 








0% 


50% 



Alzheimer 

89057 

86055 

90102 

91092 

85013 

92054 

88073 

83002 

93047 

91094 

90118 

93044 

93087 

93045 

88028 

91081 

90117 

91086 

86002 

93048 



cases 
40 
45 
49 
54 
56 
61 
66 
70 
70 
73 
77 
77 
81 
83 
85 
85 
86 
88 
90 
92 



m 

m 

m 

f 

f 

m 

f 

f 

m 
f 

m 

m 

m 

f 

f 

f 

m 

.m 
-f 
f 



+» 

+ b 
+ b 
+ b 
+ b 
+ b 
+ c 
+ c 
+ c 
+ b 
+ b 
+ b 
+ b 
+ b 
+» 



+ c 
+ c 
+ c 
+ c 
+ a 
+ c 
+ c 
+ c 
+ c 
+ c 
+ c 
+ c 
+ c 
+ c 
+ b 
+ c 
+ c 
+ c 
+ c 
+ c 



+ 
+ 

+ 
+ 
+ 



% pos. staining 



50% 



95% 



Downs' syndrome 

93162 54 

92080 58 

89005 59 

93161 62 

96015 63 

94058 64 

93028 67 

% positive staining 



m 
f 



+ b 



+ 



+ 



71% 



86% 



NBB = Netherlands Brain Bank, * Number of plaques (all types) and tangles as revealed by Congo Red and Bodian 
silver staining: a) few, b) moderate, c) many. In the absence of hippocampal tissue, patient #90015 was not studied. 
Table 6 Immunoreactivities in the human frontal and temporal cortex and hippocampus for β amyloid 
precursor protein and ubiquitin-B of which the mRNA is expressed in the +1 reading frame (resulting 
in βAPP+1 and Ubi-B+1 protein). Tissues were obtained from controls and neuropathologically 
confirmed Alzheimer and Down syndrome cases. Immunoreactivity present in tangles, dystrophic 
neurites and neuritic plaques of patients is expressed as a percentage of the total number of patients 
studied. 

                                  Frontal cortex (area 11)   Temporal cortex (area 38)   Hippocampus 
                                  βAPP+1    Ubi-B+1          βAPP+1    Ubi-B+1           βAPP+1    Ubi-B+1 

Non-demented controls 1 (n=12)    0         0                0         8*                0         50* 
Alzheimer's disease 2 (n=21)      19        80               43        95                50        95 
Down syndrome 3 (n=7)             86        86               86        86                71        86 

1 young (n=6) and aged (n=6) non-demented controls. 

2 early (<65 years, n=10) and late (>65 years, n=11) onset Alzheimer 

3 One Down syndrome patient (#96015; Tables 2-5) did not show any signs of dementia or 
neuropathology and was immunonegative for βAPP+1 and Ubi-B+1. 

Controls were matched for sex, age and postmortem delay. 

*in old non-demented patients with age related neuropathology (tangles, plaques) 

In 11 Parkinson patients no reaction for βAPP+1 or Ubi-B+1 was found in the nigrostriatal system, 
except for one patient who also suffered from Alzheimer's disease. 

When the three brain areas studied were taken together, βAPP+1 immunoreactive structures 
were present in 71% and Ubi-B+1 immunoreactive structures in 100% of the Alzheimer patients. 
Table 7 

GAGAG MOTIFS 

                       BASE PAIRS (CODING     EXPECTED     ACTUAL 
                       SEQUENCE OF LONGEST    NUMBER       NUMBER 
                       FORM)                  (1:1024) 

βAPP                   2234                   2.2          7 
Tau                    1096                   1.1 
Ubiquitin B            687                    0.7          2 
Apolipoprotein E4      953                    0.9          - 
MAP2b                  5475                   5.3          11 
NF-low (68 K)          582                    0.6          3 
NF-medium (145 K)      2748                   2.7          3 
NF-H (200 K)           3063                   3.1          2 
Presenilin I           1392                   1.4          3 
Presenilin II          1346                   1.3          3 
Big Tau                2058                   2            5 
GFAP                   1299                   1.3          6 
P53                    1239                   1.2          2 
BCL2                   717                    0.7          1 
Semaphorin III         2313                   2.3          4 
HUPF-I                 3351                   3.3          3 
HMG                    327                    0.3          1 
NSP-A                  2268                   2.3          2 
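
The "expected number (1:1024)" column of Table 7 is consistent with a simple null model in which each of the four bases is equally likely at every position, so a given five-base motif is expected about once per 4^5 = 1024 positions. The sketch below illustrates that calculation together with a motif count. It is an illustration under stated assumptions, not the procedure used to build the table: the function names are hypothetical, overlapping occurrences are counted, and the short test sequence is an excerpt of the β amyloid precursor protein mRNA shown in Figure 2-3.

```python
import re

MOTIF = "GAGAG"


def expected_motif_count(length_bp: int, motif: str = MOTIF) -> float:
    """Expected occurrences assuming equiprobable, independent bases (1 per 4**len(motif))."""
    return length_bp / 4 ** len(motif)


def count_motif(sequence: str, motif: str = MOTIF) -> int:
    """Count occurrences of the motif, including overlapping ones."""
    seq = sequence.upper().replace("U", "T")  # accept RNA or DNA spelling
    # A zero-width lookahead lets overlapping matches such as GAGAGAG count twice.
    return len(re.findall(f"(?={re.escape(motif)})", seq))


# Coding-sequence length quoted for the beta amyloid precursor protein in Table 7.
print(round(expected_motif_count(2234), 1))

# Short excerpt containing two overlapping GAGAG motifs.
print(count_motif("CACCGAGAGAGAATGTCC"))
```

Whether overlapping motifs were counted separately in the "actual number" column is not stated in the table, so the counting rule above should be adjusted if a different convention applies.
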
Table 9 

+1 protein sequences (right) predicted by a dinucleotide deletion in 
an mRNA molecule encoding for different proteins (left) 

βAPP+1                 RGRTSSKELA                 [SEQ ID NO: 1] 
Tau+1                  HGRLAPARHAS                [SEQ ID NO: 2] 
Ubi-B+1                YADLREDPDRQ                [SEQ ID NO: 3] 
Ubi-B+1                GGGAQ                      [SEQ ID NO: 5] 
Ubi-B+1                RQDHHPGSGAQ                [SEQ ID NO: 4] 
Ubi-B+1                YADLREDPDRQDHHPGSGAQ       [SEQ ID NO: 1400] 
Apo-E4+1               GAPRLPPAQAA                [SEQ ID NO: 6] 
MAP2B+1                KTRFQRKGPS                 [SEQ ID NO: 7] 
Neurofilament-L+1      PGNRSMGHE                  [SEQ ID NO: 8] 
Neurofilament-M+1      EAEGGSRS                   [SEQ ID NO: 9] 
Neurofilament-H+1      VGAARDSRAA                 [SEQ ID NO: 10] 
Presenilin I+1         HDYPPGGSV                  [SEQ ID NO: 11] 
Presenilin I+1         SIQKFQV                    [SEQ ID NO: 12] 
Presenilin II+1        VEKPGERGGR                 [SEQ ID NO: 13] 
Big Tau+1              PLFGRGHKRG                 [SEQ ID NO: 14] 
GFAP+1                 EDRGDAGWRGH                [SEQ ID NO: 15] 
P53+1                  QERGASPRAAPREH             [SEQ ID NO: 16] 
BCL2+1                 RQPGDVAPGGQHRPVDD          [SEQ ID NO: 17] 
Semaphorin III+1       AGLLAIPEAK                 [SEQ ID NO: 18] 
Semaphorin III+1       YVDVYNGGKFS                [SEQ ID NO: 19] 
HUPF-I+1               AADERRCHLLHMCGRR           [SEQ ID NO: 20] 
HUPF-I+1               QQATEAGQHYQPGSPLHDHSHV     [SEQ ID NO: 21] 
HMG+1                  PQEAAARTNR                 [SEQ ID NO: 22] 
NSP-A+1                RSWVHPAPPYQMCLG            [SEQ ID NO: 23] 
NSP-A+1                GGSRTHPR                   [SEQ ID NO: 24] 
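
How a dinucleotide deletion produces a +1 peptide such as those in Table 9 can be sketched by translating a sequence before and after the deletion. The example below uses nucleotides 1107-1226 of the β amyloid precursor protein mRNA as printed in Figure 2-3 and the standard genetic code. Which GAGAG motif carries the deletion is an assumption made only for illustration (the second motif in the fragment is used), and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq: str) -> str:
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_dinucleotide(seq: str, index: int) -> str:
    """Remove the two bases starting at index."""
    return seq[:index] + seq[index + 2:]


# Nucleotides 1107-1226 of the mRNA in Figure 2-3 (cDNA spelling, wild-type frame).
WILD_TYPE = (
    "GAGAGGCTTGAGGCCAAGCACCGAGAGAGAATGTCCCAGGTCATGAGAGAATGGGAAGAG"
    "GCAGAACGTCAAGCAAAGAACTTGCCTAAAGCTGATAAGAAGGCAGTTATCCAGCATTTC"
)

# Assumed deletion site: the GA at the start of the GAGAG motif found from position 40 on.
site = WILD_TYPE.index("GAGAG", 40)
mutant = delete_dinucleotide(WILD_TYPE, site)

print(translate(WILD_TYPE))  # wild-type reading frame, no stop within the fragment
print(translate(mutant))     # shifted frame, ending in RGRTSSKELA before a stop codon
```

With this choice of site the shifted frame runs into a stop codon immediately after the residues RGRTSSKELA, the sequence listed as SEQ ID NO: 1.
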
Table 10 



altered 



GAG AGG 
Glu Arg 



565 



GAA CGU 
Glu Arg 



GAG AGG 
Glu Arg 



GAG AGG 
Glu Arg 



CGA GAG 
Arg Glu 



1133 



CGU GAA 
Arg Glu 



AUG AGA GAA 
Met Arg Glu 



19 11S7 

AUG CGC GAA 
Met Arg Glu 



GAG AGA 
Glu Arg 



1266 



1271 



GAA CGC 
Glu Arg 
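
The altered codons in Table 10 appear to replace GAGA-containing stretches with synonymous codons, so that the encoded amino acids are unchanged while the motif disappears. The sketch below checks both properties for four codon pairs as they can be read from the table. Because the table layout is damaged, the pairing of original and altered codons is an assumption, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code for RNA, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_rna(seq: str) -> str:
    """Translate an RNA string whose length is a multiple of three."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# (original, altered) codon stretches as read from Table 10; the pairing is assumed.
CODON_PAIRS = [
    ("GAGAGG", "GAACGU"),        # Glu Arg
    ("CGAGAG", "CGUGAA"),        # Arg Glu
    ("AUGAGAGAA", "AUGCGCGAA"),  # Met Arg Glu
    ("GAGAGA", "GAACGC"),        # Glu Arg
]

results = []
for original, altered in CODON_PAIRS:
    same_protein = translate_rna(original) == translate_rna(altered)
    motif_removed = "GAGA" in original and "GAGA" not in altered
    results.append((same_protein, motif_removed))
    print(original, "->", altered, translate_rna(altered), same_protein, motif_removed)
```
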
CLAIMS 



What is claimed is: 

1. A method for the diagnosis of a disease caused by or associated with an RNA molecule having a 
transcript mutation giving rise to a frameshift mutation comprising: 

i. providing a biological sample from a patient suspected of having or developing said disease; and 

ii. detecting in said sample the presence of a mutant RNA molecule having a frameshift mutation or 
a protein encoded thereby, 

wherein detection is indicative of the disease. 

2. The method of claim 1, wherein the frameshift mutation comprises a deletion or an insertion of a 
nucleotide. 

3. The method of claim 2, wherein the frameshift mutation is associated with the nucleotide sequence 
GAGA or CTCT. 

4. The method of claim 3, wherein the frameshift mutation comprises a dinucleotide mutation 
associated with a nucleotide sequence comprising GAGA or CTCT. 

5. The method of claim 3, wherein said sequence comprises GAGAX or CTCTX, where X is one of G, 
A, T, or C. 

6. The method of claim 3, wherein said sequence comprises one of GAGAC, CTCTG, GAGAG or 
CTCTC. 

7. The method of claim 1, wherein the disease is cancer or a neurodegenerative disease. 

8. The method of claim 7, wherein the disease is Alzheimer's disease or Down's Syndrome; frontal lobe 
dementia (Pick's Disease); progressive supranuclear palsy (PSP) and other diseases with abundant 
tau-positive filamentous lesions selected from the group that includes Corticobasal degeneration, 
Dementia pugilistica, Dementia with tangles only, Dementia with tangles and calcification, 
Frontotemporal dementias with Parkinsonism linked to chromosome 17, Gerstmann-Sträussler-Scheinker 
disease with tangles, Myotonic dystrophy, Niemann-Pick disease type C, Parkinsonism- 
dementia complex of Guam, Postencephalitic Parkinsonism and Subacute sclerosing panencephalitis; 
Parkinson's disease; amyotrophic lateral sclerosis; Huntington's Disease; multiple sclerosis; dementia 
with Lewy bodies; multisystem atrophy; other inclusion body diseases associated with ubiquitin 
selected from the group that includes Alexander's disease, Alcoholic liver disease, lichen 
amyloidosis, and the presence of Marinesco bodies and Hyaline inclusions; Diabetes mellitus type II; 
and other degenerative diseases. 

9. The method of claim 1 wherein the RNA having a frameshift mutation would, if containing a 
wildtype sequence, encode the β amyloid precursor protein, the Tau protein, ubiquitin, 
apolipoprotein-E4 (Apo-E4), microtubule associated protein II (MAP 2), the neurofilament proteins 
(L, M, H), presenilin I protein, presenilin II protein, Big Tau, GFAP, P53, BCL2, semaphorin III, 
HUPF-I, HMG and NSP-A. 

10. The method of claim 1 wherein the biological sample comprises body fluid or tissue. 

11. The method of claim 10 wherein said body fluid comprises cerebral spinal fluid or blood. 

12. The method of claim 10, wherein the tissue comprises skin or nose epithelium. 

13. The method of claim 1, wherein the mutant RNA molecule is detected by formation of a nucleic acid 
duplex wherein a first strand of said duplex comprises a nucleic acid probe having a sequence 
complementary to part of the mutant RNA molecule encompassing the mutation giving rise to the 
frameshift mutation, and the second strand of said duplex comprises a nucleic acid sequence of the mutant 
RNA molecule which is complementary to said probe. 

14. The method of claim 1, wherein the mutant RNA molecule is detected using RT-PCR to reverse 
transcribe the mutant RNA molecule and then to amplify at least a fragment of the reverse transcribed 
DNA corresponding to the mutant RNA molecule, the mutant RNA molecule encompassing the mutation 
giving rise to the frameshift, and then probing for the amplified fragment using a nucleic acid probe 
having a sequence complementary to part of the reverse transcribed DNA encompassing the mutation 
giving rise to the frameshift mutation, or by sequencing the amplified fragment. 

15. The method of claim 1, wherein the protein encoded by the mutant RNA molecule is detected using 
an antibody molecule having specificity for the mutant protein and not for the wild-type protein. 

16. A method for identifying diseases caused by or associated with an RNA molecule having a transcript 
mutation giving rise to a frameshift mutation comprising: 

i. providing the sequence of an RNA molecule suspected of being involved in the pathogenesis of a 
disease; 

ii. identifying the sequence of the mutant protein encoded by the RNA sequence 3'-terminal to a 
frameshift mutation; 

iii. preparing a probe to the mutant protein or a fragment thereof; and 

iv. probing a biological sample from a patient having the disease and a biological sample from a 
patient not having the disease, 

wherein the presence of said mutant protein in a biological sample from a patient having the disease and 
the absence of said mutant protein in a biological sample from a patient not having the disease indicates 
that the presence of the mutant protein in a biological sample is a marker for the disease or susceptibility 
to the disease. 

17. A diagnostic kit for diagnosing a disease caused by or associated with an RNA molecule having a 
transcript mutation giving rise to a frameshift mutation, the kit comprising: 

i. a labeled nucleic acid probe having a sequence complementary to part of the mutant RNA 
molecule which encompasses the mutation which leads to the frameshift mutation; and 

ii. packaging materials therefor. 

18. A diagnostic kit for diagnosing a disease caused by or associated with an RNA molecule having a 
transcript mutation giving rise to a frameshift mutation comprising: 

i. a pair of primers for use in an RT-PCR reaction, wherein said pair comprises sequences 
complementary to sequences on either side of the mutation which gives rise to the frameshift 
mutation, and reagents necessary for performing an RT-PCR reaction; and 

ii. packaging materials therefor. 

19. A diagnostic kit for diagnosing a disease caused by or associated with at least one RNA molecule 
having one or more transcript mutations giving rise to a frameshift mutation comprising: 

i. an antibody molecule having specificity for the mutant protein encoded by the mutant RNA and 
not the wild-type protein; and 

ii. packaging materials therefor. 

20. A recombinant RNA molecule having a frameshift mutation, as described in any one of claims 1 to 
9. 

21. The RNA molecule of claim 20 encoding at least part of the protein sequence designated +1 or +2 
shown in any one of Figures 2-19. 

22. A mutant protein encoded by the RNA of claim 20 or 21. 

23. An immunogenic fragment of the mutant protein of claim 22. 

24. The mutant protein of claim 22 or the immunogenic fragment of claim 23, comprising the amino acid 
sequence: 

RGRTSSKELA [SEQ ID NO: 1]; 
HGRLAPARHAS [SEQ ID NO: 2]; 
YADLREDPDRQ [SEQ ID NO: 3]; 
RQDHHPGSGAQ [SEQ ID NO: 4]; 
YADLREDPDRQDHHPGSGAQ [SEQ ID NO: 1400]; 
GGGAQ [SEQ ID NO: 5]; 
GAPRLPPAQAA [SEQ ID NO: 6]; 
KTRFQRKGPS [SEQ ID NO: 7]; 
PGNRSMGHE [SEQ ID NO: 8]; 
EAEGGSRS [SEQ ID NO: 9]; 
VGAARDSRAA [SEQ ID NO: 10]; 
HDYPPGGSV [SEQ ID NO: 11]; 
SIQKFQV [SEQ ID NO: 12]; 
VEKPGERGGR [SEQ ID NO: 13]; 
PLFGRGHKRG [SEQ ID NO: 14]; 
EDRGDAGWRGH [SEQ ID NO: 15]; 
QERGASPRAAPREH [SEQ ID NO: 16]; 
RQPGDVAPGGQHRPVDD [SEQ ID NO: 17]; 
AGLLAIPEAK [SEQ ID NO: 18]; 
YVDVYNGGKFS [SEQ ID NO: 19]; 
AADERRCHLLHMCGRR [SEQ ID NO: 20]; 
QQATEAGQHYQPGSPLHDHSHV [SEQ ID NO: 21]; 
PQEAAARTNR [SEQ ID NO: 22]; 
RSWVHPAPPYQMCLG [SEQ ID NO: 23]; or 
GGSRTHPR [SEQ ID NO: 24]. 



25. A pharmaceutical composition comprising a ribozyme that selectively cleaves a target RNA having a 
GAGA or CTCT admixed with a pharmaceutically acceptable carrier. 

26. A pharmaceutical composition comprising a ribozyme that selectively cleaves a target RNA having a 
GAGA or CTCT and a wild-type analog of an RNA having a GAGA sequence giving rise to a 
frameshift mutation admixed with a pharmaceutically acceptable carrier. 

27. A pharmaceutical composition comprising a wild-type analog of an RNA having a GAGA or CTCT 
sequence giving rise to a frameshift mutation admixed with a pharmaceutically acceptable carrier. 

28. The pharmaceutical composition of claim 27 wherein said wild-type analog of an RNA comprises a 
nucleotide sequence having third base silent mutations. 

29. A pharmaceutical composition comprising a single stranded nucleic acid having a sequence that is 
complementary to an RNA having one or more GAGA or CTCT mutations giving rise to a frameshift 
mutation admixed with a pharmaceutically acceptable carrier. 

30. A pharmaceutical composition comprising the wild-type analog of a mutant protein in admixture with 
a pharmaceutically acceptable carrier. 

31. A vector comprising an expressible gene encoding a ribozyme that selectively cleaves a target RNA 
having a GAGA or CTCT. 

32. A vector comprising an expressible gene encoding a sequence complementary to an RNA having a 
GAGA or CTCT mutation giving rise to a frameshift mutation. 

33. A host cell containing a vector as described in claim 31 or 32. 

34. A method of treatment and/or prevention of a disease caused by or associated with an RNA having a 
GAGA or CTCT mutation giving rise to a frameshift mutation, comprising administering the 
composition of any one of claims 25-30, the vector of claim 31 or 32, or the host cell of claim 33 to a 
patient suffering from or susceptible to the disease. 

35. The use of a vector encoding a ribozyme that selectively cleaves a target RNA having a GAGA or 
CTCT under the control of a promoter in therapy. 

36. The use of a vector encoding a ribozyme under the control of a promoter in the manufacture of a 
composition for the treatment of a disease caused by or associated with at least one RNA having 
one or more GAGA or CTCT mutations giving rise to a frameshift mutation. 

37. The use of a vector encoding the sequence complementary to an RNA having one or more GAGA or 
CTCT mutations giving rise to a frameshift mutation under the control of a promoter in therapy. 

38. The use of more than one of the composition of any one of claims 25-30, the vector of claim 31 or 
32, or the host cell of claim 33 in any combination in therapy. 

39. The use of more than one of the composition of any one of claims 25-30, the vector of claim 31 or 
32, or the host cell of claim 33 in any combination in the treatment and/or prevention of a disease 
caused by or associated with at least one RNA having one or more GAGA or CTCT mutations 
giving rise to a frameshift mutation. 
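
The nucleic acid probe recited in claims 13 and 17, which is complementary to the part of the mutant RNA encompassing the mutation, can be illustrated with a short sketch. It builds the reverse complement of a window centred on the junction created by a dinucleotide deletion. Everything specific here is an assumption for illustration only: the function names, the choice of deletion site within an excerpt of the sequence in Figure 2-3, and the six-base flank, which is far shorter than a practical probe would be.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (cDNA spelling of the RNA)."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_probe(wild_type: str, deletion_start: int, deletion_length: int, flank: int) -> str:
    """Probe complementary to the mutant sequence spanning the deletion junction."""
    mutant = wild_type[:deletion_start] + wild_type[deletion_start + deletion_length:]
    target = mutant[deletion_start - flank:deletion_start + flank]
    return reverse_complement(target)


# Excerpt around a GAGAG motif of the sequence in Figure 2-3 (wild-type, cDNA spelling).
WILD_TYPE = "CAGGTCATGAGAGAATGGGAAGAG"

probe = junction_probe(WILD_TYPE, deletion_start=8, deletion_length=2, flank=6)
target = reverse_complement(probe)

print(probe)
print(target in WILD_TYPE)  # the junction sequence should be absent from the wild type
```

Because the window spans the junction, its sequence does not occur in the wild-type excerpt, which is the property that lets such a probe distinguish mutant from wild-type transcripts in principle.
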
Figure 1 

Paraffin section (6 µm thick) of the frontal cortex of a female Alzheimer 
patient (age 70 years) immunocytochemically incubated with an antibody 
against a peptide predicted by the +1 reading frame of βAPP. The 
hallmarks of AD, dystrophic neurites (arrowheads) (A) and tangles 
(arrows), are clearly visible in cortical layer III. RGRTSSKELA = Amy+1 (see 
Table 9). 
Figure 2-1 β amyloid precursor protein 

(Linear) MAP of: Seq check: 6510 from: 147 to: 2300 

LOCUS HUMAFPA4 3353 bp ss-mRNA PRI 15-JUN-1989 

DEFINITION Human amyloid A4 mRNA, complete cds. 

ACCESSION Y00264 

KEYWORDS amyloid fibril protein; cell surface glycoprotein. 

SOURCE human (Homo sapiens). 

ORGANISM Homo sapiens ... 

With 1 enzymes: NOTI 

September 14, 1993 11:31 

ATGCTGCCCGGTTTGGCACTGCTCCTGCTGGCCGCCTGGACGGCTCGGGCGCTGGAGGTA 

147 + + + + + + 206 

TACGACGGGCCAAACCGTGACGAGGACGACCGGCGGACCTGCCGAGCCCGCGACCTCCAT 

a CCPVWHCSCWPPGRLGRWRY- 
b AARFGTAPAGRLDGSGAGGT- 

c MLPGLALLLLAAWTARALEV 

CCCACTGATGGTAATGCTGGCCTGCTGGCTGAACCCCAGATTGCCATGTTCTGTGGCAGA 

207 + + -- + + + + 266 

GGGTGACTACCATTACGACCGGACGACCGACTTGGGGTCTAACGGTACAAGACACCGTCT 

a PLMVMLACWLNPRLPCSVAD 

b H*W*CWPAG*TPDCHVLWQT- 

C PTDGNAGLLAEPQI AMFCGR 

CTGAACATGCACATGAATGTCCAGAATGGGAAGTGGGATTCAGATCCATCAGGGACCAAA 

267 + + + + + + 326 

GACTTGTACGTGTACTTACAGGTCTTACCCTTCACCCTAAGTCTAGGTAGTCCCTGGTTT 

a *TCT*MSRMGSGIQIHQGPK- 
b EHAHECPEWEVGFRS IRDQN- 

C LNMHMNVQNGKWDSDPSGTK 

ACCTGCATTGATACCAAGGAAGGCATCCTGCAGTATTGCCAAGAAGTCTACCCTGAACTG 

327 + + + + + + 386 

TGGACGTAACTATGGTTCCTTCCGTAGGACGTCATAACGGTTCTTCAGATGGGACTTGAC 

a PALI PRKAS CS IAKKSTLNC 

b LH * YQGRH PAVLPRS LP * TA- 

c TCIDTKEG I LQYCQEVYPEL 

CAGATCACCAATGTGGTAGAAGCCAACCAACCAGTGACCATCCAGAACTGGTGCAAGCGG 

387 + + + + + 446 

GTCTAGTGGTTACACCATCTTCGGTTGGTTGGTCACTGGTAGGTCTTGACCACGTTCGCC 

a RSPMW*KPTNQ*PSRTGASG- 
b DHQCGRSQPTSDHPELVQAG- 
c Q I TNVV EAN Q PVTI QNWCKR 
Figure 2-2 

GGCCGCAAGCAGTGCAAGACCCATCCCCACTTTGTGATTCCCTACCGCTGCTTAGTTGGT 
447 + + + + + + 506 

CCGGCGTTCGTCACGTTCTGGGTAGGGGTGAAACACTAAGGGATGGCGACGAATCAACCA 

a AASSARPIPTL * FPTAA*LV 

b PQAVQDPSPLCDSLPLL SW*- 

c GRKQCKTHPHFVIPYRCLVG 

GAGTTTGTAAGTGATGCCCTTCTCGTTCCTGACAAGTGCAAATTCTTACACCAGGAGAGG 

507 + + + + + + 566 

CTCAAACATTCACTACGGGAAGAGCAAGGACTGTTCACGTTTAAGAATGTGGTCCTCTCC 

a SL + VMPFSFLTSANSYTRRG- 

b VCK*CPSRS*QVQILTPGED- 
c EFVSDALLVPDKCKFLHQER 

ATGGATGTTTGCGAAACTCATCTTCACTGGCACACCGTCGCCAAAGAGACATGCAGTGAG 

567 + + + + + + 626 

TACCTACAAACGCTTTGAGTAGAAGTGACCGTGTGGCAGCGGTTTCTCTGTACGTCACTC 

a WMFAKL I FTGT P S PKRHAVR 

b GCLRNSSSLAHRRQRDMQ * E - 

c MDVCETHLHWHTVAKETCSE 

AAGAGTACCAACTTGCATGACTACGGCATGTTGCTGCCCTGCGGAATTGACAAGTTCCGA 

627 + + + + + + 686 

TTCTCATGGTTGAACGTACTGATGCCGTACAACGACGGGACGCCTTAACTGTTCAAGGCT 

a RVPTCMTTACCCPAELTSSE 

b EYQLA * LRHVAALRN * QVPR- 

C KSTNLHDYGMLLPCGIDKFR 

GGGGTAGAGTTTGTGTGTTGCCCACTGGCTGAAGAAAGTGACAATGTGGATTCTGCTGAT 

687 + + + + + + 746 

CCCCATCTCAAACACACAACGGGTGACCGACTTCTTTCACTGTTACACCTAAGACGACTA 

a G * SIiCVAHWLKKVTMWILLM- 

b GRVCVLPTG*RK*QCGFC*C- 
C GVEFVCCPLAEESDNVDSAD 

GCGGAGGAGGATGACTCGGATGTCTGGTGGGGCGGAGCAGACACAGACTATGCAGATGGG 

747 + + + + + + 806 

CGCCTCCTCCTACTGAGCCTACAGACCACCCCGCCTCGTCTGTGTCTGATACGTCTACCC 

a RRRMTRMSGGAEQTQTMQMG 

b GGG* LGCLVGRSRHRLCRWE- 

c AEEDDSDVWWGGADTDYADG 

AGTGAAGACAAAGTAGTAGAAGTAGCAGAGGAGGAAGAAGTGGCTGAGGTGGAAGAAGAA 

807 + + + + + + - 866 

TCACTTCTGTTTCATCATCTTCATCGTCTCCTCCTTCTTCACCGACTCCACCTTCTTCTT 
Figure 2-3 

a VKTK**K*QRRKKWLRWKKK- 
b * RQSSRSSRGGRSG * G G R R R - 

C SEDKVVEVAEEEEVAEVEEE 

GAAGCCGATGATGACGAGGACGATGAGGATGGTGATGAGGTAGAGGAAGAGGCTGAGGAA 

867 + + + + + + 926 

CTTCGGCTACTACTGCTCCTGCTACTCCTACCACTACTCCATCTCCTTCTCCGACTCCTT 

a KPMMTRTMRMVMR*RKRLRN- 
b SR**RGR*GW**GRGRG*GT- 
c EADDDEDDEDGDEVEEEAEE 

CCCTACGAAGAAGCCACAGAGAGAACCACCAGCATTGCCACCACCACCACCACCACCACA 

927 + + + + + + 986 

GGGATGCTTCTTCGGTGTCTCTCTTGGTGGTCGTAACGGTGGTGGTGGTGGTGGTGGTGT 

a PTKKPQREPPALPPPPPPPQ- 
b LRRSHRENHQHCHHHHHHHR- 
c PYEEATERTTS IATTTTTTT 

GAGTCTGTGGAAGAGGTGGTTCGAGTTCCTACAACAGCAGCCAGTACCCCTGATGCCGTT 

987 + + + + + + 1046

CTCAGACACCTTCTCCACCAAGCTCAAGGATGTTGTCGTCGGTCATGGGGACTACGGCAA 

a SLWKRWFEFLQQQPVPLMPL- 
b VCGRGGSSSYNSSQYP*CR*- 
c ESVEEVVRVPTTAASTPDAV 

GACAAGTATCTCGAGACACCTGGGGATGAGAATGAACATGCCCATTTCCAGAAAGCCAAA 

1047 + + + + + + 1106

CTGTTCATAGAGCTCTGTGGACCCCTACTCTTACTTGTACGGGTAAAGGTCTTTCGGTTT 

a TSISRHLGMRMNMPISRKPK-

b QVSRDTWG*E*TCPFPESQR-

c DKYLETPGDENEHAHFQKAK 

GAGAGGCTTGAGGCCAAGCACCGAGAGAGAATGTCCCAGGTCATGAGAGAATGGGAAGAG 

1107 + + + + + + 1166

CTCTCCGAACTCCGGTTCGTGGCTCTCTCTTACAGGGTCCAGTACTCTCTTACCCTTCTC 

a RGLRPSTERECPRS*ENGKR

b EA*GQAPRENVPGHERMGRG-

c ERLEAKHRERMSQVMREWEE

GCAGAACGTCAAGCAAAGAACTTGCCTAAAGCTGATAAGAAGGCAGTTATCCAGCATTTC 

1167 + + + + + + 1226 

CGTCTTGCAGTTCGTTTCTTGAACGGATTTCGACTATTCTTCCGTCAATAGGTCGTAAAG 

a QNVKQRTCLKLIRRQLSSIS- 
b RTSSKELA*S**EGSYPAFP-

c AERQAKNLPKADKKAVIQHF
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Figure 2-4 

CAGGAGAAAGTGGAATCTTTGGAACAGGAAGCAGCCAACGAGAGACAGCAGCTGGTGGAG 

1227 + + + + + + 1286 

GTCCTCTTTCACCTTAGAAACCTTGTCCTTCGTCGGTTGCTCTCTGTCGTCGACCACCTC 

a RRKWNLWNRKQPTRDSSWWR- 
b GESGIFGTGSSQRETAAGGD-

c QEKVESLEQEAANERQQLVE

ACACACATGGCCAGAGTGGAAGCCATGCTCAATGACCGCCGCCGCCTGGCCCTGGAGAAC 

1287 + + + + + + 1346 

TGTGTGTACCGGTCTCACCTTCGGTACGAGTTACTGGCGGCGGCGGACCGGGACCTCTTG 

a HTWPEWKPCSMTAAAWPWRT 

b THGQSGSHAQ*PPPPGPGEL-

c THMARVEAMLNDRRRLALEN 

TACATCACCGCTCTGCAGGCTGTTCCTCCTCGGCCTCGTCACGTGTTCAATATGCTAAAG 

1347 + + + + + + 1406 

ATGTAGTGGCGAGACGTCCGACAAGGAGGAGCCGGAGCAGTGCACAAGTTATACGATTTC 

a TSPLCRLFLLGLVTCSIC*R-

b HHRSAGCSSSASSRVQYAKE-

c YITALQAVPPRPRHVFNMLK

AAGTATGTCCGCGCAGAACAGAAGGACAGACAGCACACCCTAAAGCATTTCGAGCATGTG 

1407 + + + + + + 1466 

TTCATACAGGCGCGTCTTGTCTTCCTGTCTGTCGTGTGGGATTTCGTAAAGCTCGTACAC 

a SMSAQNRRTDSTP*SISSMC- 
b VCPRRTEGQTAHPKAFRACA- 
c KYVRAEQKDRQHTLKHFEHV 

CGCATGGTGGATCCCAAGAAAGCCGCTCAGATCCGGTCCCAGGTTATGACACACCTCCGT 

1467 + + + + + + 1526

GCGTACCACCTAGGGTTCTTTCGGCGAGTCTAGGCCAGGGTCCAATACTGTGTGGAGGCA 

a AWWIPRKPLRSGPRL*HTSV- 
b HGGSQESRSDPVPGYDTPPC- 
c RMVDPKKAAQIRSQVMTHLR

GTGATTTATGAGCGCATGAATCAGTCTCTCTCCCTGCTCTACAACGTGCCTGCAGTGGCC 

1527 + + + + + + 1586 

CACTAAATACTCGCGTACTTAGTCAGAGAGAGGGACGAGATGTTGCACGGACGTCACCGG 

a *FMSA*ISLSPCSTTCLQWP-
b DL*AHESVSLPALQRACSGR- 
c VIYERMNQSLSLLYNVPAVA 

GAGGAGATTCAGGATGAAGTTGATGAGCTGCTTCAGAAAGAGCAAAACTATTCAGATGAC

1587 + + + + + + 1646 

CTCCTCTAAGTCCTACTTCAACTACTCGACGAAGTCTTTCTCGTTTTGATAAGTCTACTG 
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Figure 2-5 

a RRFRMKLMSCFRKSKTIQMT- 
b GDSG*S**AASERAKLFR*R- 
c EEIQDEVDELLQKEQNYSDD 

GTCTTGGCCAACATGATTAGTGAACCAAGGATCAGTTACGGAAACGATGCTCTCATGCCA 

1647 + + + + + + 1706

CAGAACCGGTTGTACTAATCACTTGGTTCCTAGTCAATGCCTTTGCTACGAGAGTACGGT 

a SWPT*LVNQGSVTETMLSCH

b LGQHD**TKDQLRKRCSHAI-

c VLANMISEPRISYGNDALMP

TCTTTGACCGAAACGAAAACCACCGTGGAGCTCCTTCCCGTGAATGGAGAGTTCAGCCTG 

1707 + + + + + + 1766 

AGAAACTGGCTTTGCTTTTGGTGGCACCTCGAGGAAGGGCACTTACCTCTCAAGTCGGAC 

a L*PKRKPPWSSFP*MESSAW- 
b FDRNENHRGAPSREWRVQPG- 
c SLTETKTTVELLPVNGEFSL 

GACGATCTCCAGCCGTGGCATTCTTTTGGGGCTGACTCTGTGCCAGCCAACACAGAAAAC 

1767 + + + + + + 1826 

CTGCTAGAGGTCGGCACCGTAAGAAAACCCCGACTGAGACACGGTCGGTTGTGTCTTTTG 

a TISSRGILLGLTLCQPTQKT- 
b RSPAVAFFWG*LCASQHRKR-

c DDLQPWHSFGADSVPANTEN 

GAAGTTGAGCCTGTTGATGCCCGCCCTGCTGCCGACCGAGGACTGACCACTCGACCAGGT 

1827 + + + + + + 1886 

CTTCAACTCGGACAACTACGGGCGGGACGACGGCTGGCTCCTGACTGGTGAGCTGGTCCA 

a KLSLLMPALLPTED*PLDQV-

b S*AC*CPPCCRPRTDHSTRF-
c EVEPVDARPAADRGLTTRPG

TCTGGGTTGACAAATATCAAGACGGAGGAGATCTCTGAAGTGAAGATGGATGCAGAATTC 

1887 + + + + + + 1946

AGACCCAACTGTTTATAGTTCTGCCTCCTCTAGAGACTTCACTTCTACCTACGTCTTAAG 

a LG*QISRRRRSLK*RWMQNS- 
b WVDKYQDGGDL*SEDGCRIP-

c SGLTNIKTEEISEVKMDAEF

CGACATGACTCAGGATATGAAGTTCATCATCAAAAATTGGTGTTCTTTGCAGAAGATGTG 

1947 + + + + + + 2006 

GCTGTACTGAGTCCTATACTTCAAGTAGTAGTTTTTAACCACAAGAAACGTCTTCTACAC 

a DMTQDMKFIIKNWCSLQKMW-

b T*LRI*SSSSKIGVLCRRCG-
c RHDSGYEVHHQKLVFFAEDV
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Figure 2-6 

GGTTCAAACAAAGGTGCAATCATTGGACTCATGGTGGGCGGTGTTGTCATAGCGACAGTG 
2007 + + + + + + 2066

CCAAGTTTGTTTCCACGTTAGTAACCTGAGTACCACCCGCCACAACAGTATCGCTGTCAC 

a VQTKVQSLDSWWAVLS*RQ*-
b FKQRCNHWTHGGRCCHSDSD-
c GSNKGAIIGLMVGGVVIATV

ATCGTCATCACCTTGGTGATGCTGAAGAAGAAACAGTACACATCCATTCATCATGGTGTG 
2067 + + + + + + 2126

TAGCAGTAGTGGAACCACTACGACTTCTTCTTTGTCATGTGTAGGTAAGTAGTACCACAC 

a SSSPW*C*RRNSTHPFIMVW- 
b RHHLGDAEEETVHIHSSWCG-

c IVITLVMLKKKQYTSIHHGV 

GTGGAGGTTGACGCCGCTGTCACCCCAGAGGAGCGCCACCTGTCCAAGATGCAGCAGAAC 
2127 + + + + + + 2186 

CACCTCCAACTGCGGCGACAGTGGGGTCTCCTCGCGGTGGACAGGTTCTACGTCGTCTTG 

a WRLTPLSPQRSATCPRCSRT-

b GG*RRCHPRGAPPVQDAAER-

c VEVDAAVTPEERHLSKMQQN

GGCTACGAAAATCCAACCTACAAGTTCTTTGAGCAGATGCAGAACTAGACCCCCGCCACA 

2187 + + + + + + 2246

CCGATGCTTTTAGGTTGGATGTTCAAGAAACTCGTCTACGTCTTGATCTGGGGGCGGTGT 

a ATKIQPTSSLSRCRTRPPPQ-

b LRKSNLQVL*ADAELDPRHS-

c GYENPTYKFFEQMQN*TPAT 

GCAGCCTCTGAAGTTGGACAGCAAAACCATTGCTTCACTACCCATCGGTGTCCA 

2247 + + + + + + 2300 

CGTCGGAGACTTCAACCTGTCGTTTTGGTAACGAAGTGATGGGTAGCCACAGGT 

a QPLKLDSKTIASLPIGV 
b SL*SWTAKPLLHYPSVS
c AASEVGQQNHCFTTHRCP

Enzymes that do cut:

NONE

Enzymes that do not cut:
NotI
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Figure 3-1 Tau 

(Linear) MAP of: Seq check: 9711 from: 38 to: 1096 

RL ; HSTAUA - Human microtubule-associated protein tau mRNA, complete cds

ID HSTAUA standard; RNA; PRI ; 1108 BP. 

XX 

AC J03778; 
XX 

DT 04-OCT-1988 (Rel. 17, Created) . . .
With 1 enzymes: NOTI 

September 14, 1993 12:12 
ATGGCTGAGCCCCGCCAGGAGTTCGAAGTGATGGAAGATCACGCTGGGACGTACGGGTTG 



38 + + + + + + 97

TACCGACTCGGGGCGGTCCTCAAGCTTCACTACCTTCTAGTGCGACCCTGCATGCCCAAC 

a G*APPGVRSDGRSRWDVRVG-

b MAEPRQEFEVMEDHAGTYGL

c WLSPARSSK*WKITLGRTGW-

GGGGACAGGAAAGATCAGGGGGGCTACACCATGCACCAAGACCAAGAGGGTGACACGGAC 

98 + + + + + + 157

CCCCTGTCCTTTCTAGTCCCCCCGATGTGGTACGTGGTTCTGGTTCTCCCACTGTGCCTG 

a GQERSGGLHHAPRPRG*HGR- 
b GDRKDQGGYTMHQDQEGDTD 
c GTGKIRGATPCTKTKRVTRT

GCTGGCCTGAAAGCTGAAGAAGCAGGCATTGGAGACACCCCCAGCCTGGAAGACGAAGCT 

158 + + + + + + 217 

CGACCGGACTTTCGACTTCTTCGTCCGTAACCTCTGTGGGGGTCGGACCTTCTGCTTCGA

a WPES*RSRHWRHPQPGRRSC-

b AGLKAEEAGIGDTPSLEDEA

c LA*KLKKQALETPPAWKTKL-

GCTGGTCACGTGACCCAAGCTCGCATGGTCAGTAAAAGCAAAGACGGGACTGGAAGCGAT 

218 + + + + + + 277

CGACCAGTGCACTGGGTTCGAGCGTACCAGTCATTTTCGTTTCTGCCCTGACCTTCGCTA 

a WSRDPSSHGQ*KQRRDWKR*-

b AGHVTQARMVSKSKDGTGSD

c LVT*PKLAWSVKAKTGLEAM-
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Figure 3-2 

GACAAAAAAGCCAAGGGGGCTGATGGTAAAACGAAGATCGCCACACCGCGGGGAGCAGCC 
278 + + + + + + 337

CTGTTTTTTCGGTTCCCCCGACTACCATTTTGCTTCTAGCGGTGTGGCGCCCCTCGTCGG 



a QKSQGG*W*NEDRHTAGSSP 
b DKKAKGADGKTKIATPRGAA

c TKKPRGLMVKRRSPHRGEQP



CCTCCAGGCCAGAAGGGCCAGGCCAACGCCACCAGGATTCCAGCAAAAACCCCGCCCGCT

338 + + + + + + 397

GGAGGTCCGGTCTTCCCGGTCCGGTTGCGGTGGTCCTAAGGTCGTTTTTGGGGCGGGCGA



a SRPEGPGQRHQDSSKNPARS- 

b PPGQKGQANATRIPAKTPPA

c LQARRARPTPPGFQQKPRPL-



CCAAAGACACCACCCAGCTCTGGTGAACCTCCAAAATCAGGGGATCGCAGCGGCTACAGC

398 + + + + + + 457

GGTTTCTGTGGTGGGTCGAGACCACTTGGAGGTTTTAGTCCCCTAGCGTCGCCGATGTCG



a KDTTQLW*TSKIRGSQRLQQ

b PKTPPSSGEPPKSGDRSGYS
c QRHHPALVNLQNQGIAAATA



AGCCCCGGCTCCCCAGGCACTCCCGGCAGCCGCTCCCGCACCCCGTCCCTTCCAACCCCA

458 + + + + + + 517

TCGGGGCCGAGGGGTCCGTGAGGGCCGTCGGCGAGGGCGTGGGGCAGGGAAGGTTGGGGT



a PRLPRHSRQPLPHPVPSNPT- 

b SPGSPGTPGSRSRTPSLPTP 

C APAPQALPAAAPAPRPFQPH- 

CCCACCCGGGAGCCCAAGAAGGTGGCAGTGGTCCGTACTCCACCCAAGTCGCCGTCTTCC 
518 + + + + + + 577 

GGGTGGGCCCTCGGGTTCTTCCACCGTCACCAGGCATGAGGTGGGTTCAGCGGCAGAAGG 

a HPGAQEGGSGPYSTQVAVFR-

b PTREPKKVAVVRTPPKSPSS

c PPGSPRRWQWSVLHPSRRLP- 



GCCAAGAGCCGCCTGCAGACAGCCCCCGTGCCCATGCCAGACCTGAAGAATGTCAAGTCC

578 + + + + + + 637

CGGTTCTCGGCGGACGTCTGTCGGGGGCACGGGTACGGTCTGGACTTCTTACAGTTCAGG



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Figure 3-3 

a QEPPADSPRAHARPEECQVQ-

b AKSRLQTAPVPMPDLKNVKS

c PRAACRQPPCPCQT*RMSSP- 

AAGATCGGCTCCACTGAGAACCTGAAGCACCAGCCGGGAGGCGGGAAGGTGCAAATAGTC 

638 + + + + + + 697

TTCTAGCCGAGGTGACTCTTGGACTTCGTGGTCGGCCCTCCGCCCTTCCACGTTTATCAG 

a DRLH*EPEAPAGRREGANSL-

b KIGSTENLKHQPGGGKVQIV
c RSAPLRT*STSREAGRCK*S

TACAAACCAGTTGACCTGAGCAAGGTGACCTCCAAGTGTGGCTCATTAGGCAACATCCAT 

698 + + + + + + 757 

ATGTTTGGTCAACTGGACTCGTTCCACTGGAGGTTCACACCGAGTAATCCGTTGTAGGTA 

a QTS*PEQGDLQVWLIRQHPS-

b YKPVDLSKVTSKCGSLGNIH

c TNQLT*AR*PPSVAH*ATSI

CATAAACCAGGAGGTGGCCAGGTGGAAGTAAAATCTGAGAAGCTTGACTTCAAGGACAGA 

758 + + + + + + 817 

GTATTTGGTCCTCCACCGGTCCACCTTCATTTTAGACTCTTCGAACTGAAGTTCCTGTCT 

a *TRRWPGGSKI*EA*LQGQS-

b HKPGGGQVEVKSEKLDFKDR
c INQEVARWK*NLRSLTSRTE

GTCCAGTCGAAGATTGGGTCCCTGGACAATATCACCCACGTCCCTGGCGGAGGAAATAAA 

818 + + + + + + 877

CAGGTCAGCTTCTAACCCAGGGACCTGTTATAGTGGGTGCAGGGACCGCCTCCTTTATTT 

a PVEDWVPGQYHPRPWRRK*K-

b VQSKIGSLDNITHVPGGGNK 

c SSRRLGPWTISPTSLAEEIK- 

AAGATTGAAACCCACAAGCTGACCTTCCGCGAGAACGCCAAAGCCAAGACAGACCACGGG 

878 + + + + + + 937

TTCTAACTTTGGGTGTTCGACTGGAAGGCGCTCTTGCGGTTTCGGTTCTGTCTGGTGCCC 

a D*NPQADLPRERQSQDRPRG-

b KIETHKLTFRENAKAKTDHG

c RLKPTS*PSARTPKPRQTTG-
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Figure 3-4 

GCGGAGATCGTGTACAAGTCGCCAGTGGTGTCTGGGGACACGTCTCCACGGCATCTCAGC 
938 + + + + + + 997

CGCCTCTAGCACATGTTCAGCGGTCACCACAGACCCCTGTGCAGAGGTGCCGTAGAGTCG 

a GDRVQVASGVWGHVSTASQQ- 

b AEIVYKSPVVSGDTSPRHLS 

C RRSCTSRQWCLGTRLHGISA- 

AATGTCTCCTCCACCGGCAGCATCGACATGGTAGACTCGCCCCAGCTCGCCACGCTAGCT 
998 + + + + + + 1057

TTACAGAGGAGGTGGCCGTCGTAGCTGTACCATCTGAGCGGGGTCGAGCGGTGCGATCGA 

a CLLHRQHRHGRLAPARHAS*-

b NVSSTGSIDMVDSPQLATLA

c MSPPPAASTW*TRPSSPR*L-

GACGAGGTGTCTGCCTCCCTGGCCAAGCAGGGTTTGTGA 

1058 + + + + 1096 

CTGCTCCACAGACGGAGGGACCGGTTCGTCCCAAACACT 

a RGVCLPGQAGFV 
b DEVSASLAKQGL* 
c TRCLPPWPSRVC 

Enzymes that do cut : 

NONE 

Enzymes that do not cut: 
Not I 
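
The rows labelled a, b and c under each 60-base segment of these maps are the translations of the upper strand in three reading frames. A minimal sketch of how one such row can be reproduced, assuming the standard genetic code; the function name `translate` and the `offset` convention are illustrative only and are not taken from the mapping program that generated the figures. The input is the first tau segment (positions 38 to 97 of Figure 3-1).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, offset=0):
    """Translate complete codons of seq starting at offset (0, 1 or 2).

    Stop codons are written as '*', as in the figures. Bases other than
    A, C, G, T/U raise KeyError in this sketch.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Upper strand of the first segment of Figure 3-1 (tau, positions 38-97).
TAU_38_97 = "ATGGCTGAGCCCCGCCAGGAGTTCGAAGTGATGGAAGATCACGCTGGGACGTACGGGTTG"

for offset in range(3):
    print(offset, translate(TAU_38_97, offset))
```

With offset 0 the result corresponds to row b of that segment, which is the tau open reading frame; offsets 1 and 2 give the first 19 residues of rows c and a, whose final residue spans into the next segment.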
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Figure 4-1 Ubiquitin B 

(Linear) MAP of: Seq check: 2987 from: 1094 to: 1800 

LOCUS HUMYUBG1 2118 bp ds-DNA PRI 15-MAR-1988

DEFINITION Human ubiquitin gene (3 repeats).

ACCESSION X04803

KEYWORDS ubiquitin.

SOURCE human (Homo sapiens) . 

ORGANISM Homo sapiens . . . 

With 1 enzymes : NOTI 

September 14, 1993 11:58 

ATGCAGATCTTCGTGAAAACCCTTACCGGCAAGACCATCACCCTTGAGGTGGAGCCCAGT 

1094 + + + + + + 1153 

TACGTCTAGAAGCACTTTTGGGAATGGCCGTTCTGGTAGTGGGAACTCCACCTCGGGTCA 

a ADLRENPYRQDHHP*GGAQ*-

b MQIFVKTLTGKTITLEVEPS 

c CRSS*KPLPARPSPLRWSPV- 

GACACCATCGAAAATGTGAAGGCCAAGATCCAGGATAAGGAAGGCATTCCCCCCGACCAG 

1154 + + + + + + 1213 

CTGTGGTAGCTTTTACACTTCCGGTTCTAGGTCCTATTCCTTCCGTAAGGGGGGCTGGTC 

a HHRKCEGQDPG * GRHSPRPA- 

b DTIENVKAKIQDKEGIPPDQ

c TPSKM*RPRSRIRKAFPPTS

CAGAGGCTCATCTTTGCAGGCAAGCAGCTGGAAGATGGCCGTACTCTTTCTGACTACAAC 

1214 + + + + + + 1273 

GTCTCCGAGTAGAAACGTCCGTTCGTCGACCTTCTACCGGCATGAGAAAGACTGATGTTG 

a EAHLCRQAAGRWPYSF*LQH-

b QRLIFAGKQLEDGRTLSDYN

c RGSSLQASSWKMAVLFLTTT-

ATCCAGAAGGAGTCGACCCTGCACCTGGTCCTGCGTCTGAGAGGTGGTATGCAGATCTTC 

1274 + + + + + + 1333 

TAGGTCTTCCTCAGCTGGGACGTGGACCAGGACGCAGACTCTCCACCATACGTCTAGAAG 

a PEGVDPAPGPASERWYADLR-

b IQKESTLHLVLRLRGGMQIF

c SRRSRPCTWSCV*EVVCRSS-
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Figure 4-2 

GTGAAGACCCTGACCGGCAAGACCATCACCCTGGAAGTGGAGCCCAGTGACACCATCGAA 
1334 + + + + + + 1393

CACTTCTGGGACTGGCCGTTCTGGTAGTGGGACCTTCACCTCGGGTCACTGTGGTAGCTT 

a EDPDRQDHHPGSGAQ* HHRK- 

b VKTLTGKTITLEVEPSDTIE 

C *RP*PARPSPWKWSPVTPSK- 



AATGTGAAGGCCAAGATCCAGGATAAAGAAGGCATCCCTCCCGACCAGCAGAGGCTCATC

1394 + + + + + + 1453

TTACACTTCCGGTTCTAGGTCCTATTTCTTCCGTAGGGAGGGCTGGTCGTCTCCGAGTAG



a CEGQDPG*RRHPSRPAEAHL-

b NVKAKIQDKEGIPPDQQRLI

c M*RPRSRIKKASLPTSRGSS



TTTGCAGGCAAGCAGCTGGAAGATGGCCGCACTCTTTCTGACTACAACATCCAGAAGGAG 

1454 + + + + + + 1513

AAACGTCCGTTCGTCGACCTTCTACCGGCGTGAGAAAGACTGATGTTGTAGGTCTTCCTC 



a CRQAAGRWPHSF*LQHPEGV-

b FAGKQLEDGRTLSDYNIQKE 
c LQASSWKMAALFLTTTSRRS 



TCGACCCTGCACCTGGTCCTGCGTCTGAGAGGTGGTATGCAGATCTTCGTGAAGACCCTG 
1514 + + + + + + 1573

AGCTGGGACGTGGACCAGGACGCAGACTCTCCACCATACGTCTAGAAGCACTTCTGGGAC 



a DPAPGPASERWYADLREDPD-

b STLHLVLRLRGGMQIFVKTL

c RPCTWSCV*EVVCRSS*RP*-



ACCGGCAAGACCATCACTCTGGAGGTGGAGCCCAGTGACACCATCGAAAATGTGAAGGCC 
1574 + + + + + + 1633

TGGCCGTTCTGGTAGTGAGACCTTCACCTCGGGTCACTGTGGTAGCTTTTACACTTCCGG 



a RQDHHSGGGAQ*HHRKCEGQ-

b TGKTITLEVEPSDTIENVKA

c PARPSLWKWSPVTPSKM*RP- 



AAGATCCAAGATAAAGAAGGCATCCCTCCCGACCAGCAGAGGCTCATCTTTGCAGGCAAG 

1634 + + + + + + 1693

TTCTAGGTTCTATTTCTTCCGTAGGGAGGGCTGGTCGTCTCCGAGTAGAAACGTCCGTTC 
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Figure 4-3 

a DPR*RRHPSRPAEAHLCRQA-

b KIQDKEGIPPDQQRLIFAGK

c RSKIKKASLPTSRGSSLQAS-

CAGCTGGAAGATGGCCGCACTCTTTCTGACTACAACATCCAGAAGGAGTCGACCCTGCAC 

1694 + + + + + + 1753 

GTCGACCTTCTACCGGCGTGAGAAAGACTGATGTTGTAGGTCTTCCTCAGCTGGGACGTG 

a AGRWPHSF*LQHPEGVDPAP-

b QLEDGRTLSDYNIQKESTLH
c SWKMAALFLTTTSRRSRPCT 

CTGGTCCTGCGCCTGAGGGGTGGCTGTTAATTCTTCAGTCATGGCAT 

1754 + + + + + 1800 

GACCAGGACGCGGACTCCCCACCGACAATTAAGAAGTCAGTACCGTA 

a GPAPEGWLLILQSWH 
b LVLRLRGGC*FFSHG

c WSCA*GVAVNSSVMA 

Enzymes that do cut: 

NONE 

Enzymes that do not cut : 
Not I 
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Figure 5-1 Apolipoprotein E 

(Linear) MAP of: hsapoe01.gcg check: 2800 from: 1 to: 1157

ID HSAPOE01 standard; RNA; HUM; 1157 BP. 
XX 

AC M12529; 
XX 

NI g178848

XX . . . 

With 1 enzymes: NOTI 

October 31, 1996 15:09 

ccccagcggaggtgaaggacgtccttccccaggagccgactggccaatcacaggcaggaa 
1 + + + + + + 60

ggggtcgcctccacttcctgcaggaaggggtcctcggctgaccggttagtgtccgtcctt

a PQRR*RTSFPRSRLANHRQE

b PSGGEGRPSPGADWPITGRK-

c PAEVKDVLPQEPTGQSQAGR-

gatgaaggttctgtgggctgcgttgctggtcacattcctggcaggatgccaggccaaggt 
61 + + + + + + 120 

ctacttccaagacacccgacgcaacgaccagtgtaaggaccgtcctacggtccggttcca 

a DEGSVGCVAGHIPGRMPGQG

b MKVLWAALLVTFLAGCQAKV- 

C *RFCGLRCWSHSWQDARPRW- 

ggagcaagcggtggagacagagccggagcccgagctgcgccagcagaccgagtggcagag 
121 + + + + + + 180

cctcgttcgccacctctgtctcggcctcgggctcgacgcggtcgtctggctcaccgtctc 

a GASGGDRAGARAAPADRVAE 

b EQAVETEPEPELRQQTEWQS 

c SKRWRQSRSPSCASRPSGRA- 

cggccagcgctgggaactggcactgggtcgcttttgggattacctgcgctgggtgcagac 
181 + + + + + + 240

gccggtcgcgacccttgaccgtgacccagcgaaaaccctaatggacgcgacccacgtctg 

a RPALGTGTGSLLGLPALGAD

b GQRWELALGRFWDYLRWVQT-

c ASAGNWHWVAFGITCAGCRH-
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Figure 5-2 

actgtctgagcaggtgcaggaggagctgctcagctcccaagtcacccaagaactgagggc 

241 + + + + + + 300

tgacagactcgtccacgtcctcctcgacgagtcgagggttcagtgggttcttgactcccg 

a TV*AGAGGAAQLPSHPRTEG

b LSEQVQEELLSSQVTQELRA- 

c CLSRCRRSCSAPKSPKN*GR- 

gctgatggacgagaccatgaaggagttgaaggcctacaaatcggaactggaggaacaact 

301 + + + + + + 360 

cgactacctgctctggtacttcctcaacttccggatgtttagccttgacctccttgttga 

a ADGRDHEGVEGLQIGTGGTT

b LMDETMKELKAYKSELEEQL- 

c *WTRP*RS*RPTNRNWRNN*- 

gaccccggtagcggaggagacgcgggcacggctgtccaaggagctgcagacggcgcaggc 

361 + + + + + + 420 

ctggggccatcgcctcctctgcgcccgtgccgacaggttcctcgacgtctgccgcgtccg

a DPGSGGDAGTAVQGAADGAG

b TPVAEETRARLSKELQTAQA

c PR*RRRRGHGCPRSCRRRRP-

NotI

ccggctgggcgcggacatggaggacgtgtgcggccgcctggtgcagtaccgcggcgaggt 

421 + + + + + + 480 

ggccgacccgcgcctgtacctcctgcacacgccggcggaccacgtcatggcgccgctcca 

a PAGRGHGGRVRPPGAVPRRG

b RLGADMEDVCGRLVQYRGEV

c GWARTWRTCAAAWCSTAARC-

gcaggccatgctcggccagagcaccgaggagctgcgggtgcgcctcgcctcccacctgcg 

481 + + + + + + 540 

cgtccggtacgagccggtctcgtggctcctcgacgcccacgcggagcggagggtggacgc 

a AGHARPEHRGAAGAPRLPPA

b QAMLGQSTEELRVRLASHLR

c RPCSARAPRSCGCASPPTCA-



Figure 5-3 

caagctgcgtaagcggctcctccgcgatcccgatgacctgcagaagcgcctggcagtgta 
541 + + + + + + 600

gttcgacgcattcgccgaggaggcgctagggctactggacgtcttcgcggaccgtcacat 



a QAA*AAPPRSR*PAEAPGSV

b KLRKRLLRDPDDLQKRLAVY 
c SCVSGSSAIPMTCRSAWQCT 



ccaggccggggcccgcgagggcgccgagcgcggcctcagcgccatccgcgagcgcctggg 
601 + + + + + + 660

ggtccggccccgggcgctcccgcggctcgcgccggagtcgcggtaggcgctcgcggaccc 



a PGRGPRGRRARPQRHPRAPG

b QAGAREGAERGLSAIRERLG

c RPGPARAPSAASAPSASAWG



gcccctggtggaacagggccgcgtgcgggccgccactgtgggctccctggccggccagcc 

661 + + + + + + 720

cggggaccaccttgtcccggcgcacgcccggcggtgacacccgagggaccggccggtcgg 



a APGGTGPRAGRHCGLPGRPA

b PLVEQGRVRAATVGSLAGQP-

c PWWNRAACGPPLWAPWPASR- 

gctacaggagcgggcccaggcctggggcgagcggctgcgcgcgcggatggaggagatggg 
721 + + + + + + 780 

cgatgtcctcgcccgggtccggaccccgctcgccgacgcgcgcgcctacctcctctaccc 

a ATGAGPGLGRAAARADGGDG

b LQERAQAWGERLRARMEEMG- 

C YRSGPRPGASGCARGWRRWA- 

cagtcggacccgcgaccgcctggacgaggtgaaggagcaggtggcggaggtgcgcgccaa 
781 + + + + + + 840 

gtcagcctgggcgctggcggacctgctccacttcctcgtccaccgcctccacgcgcggtt 

a QSDPRPPGRGEGAGGGGARQ

b SRTRDRLDEVKEQVAEVRAK-

c VGPATAWTR*RSRWRRCAPS- 



gctggaggagcaggcccagcagatacgcctgcaggccgaggccttccaggcccgcctcaa 
841 + + + + + + 900

cgacctcctcgtccgggtcgtctatgcggacgtccggctccggaaggtccgggcggagtt 
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Figure 5-4 

a AGGAGPADTPAGRGLPGPPQ

b LEEQAQQIRLQAEAFQARLK-

c WRSRPSRYACRPRPSRPASR- 

gagctggttcgagcccctggtggaagacatgcagcgccagtgggccgggctggtggagaa 

901 + + + + + + 960 

ctcgaccaagctcggggaccaccttctgtacgtcgcggtcacccggcccgaccacctctt 

a ELVRAPGGRHAAPVGRAGGE 

b SWFEPLVEDMQRQWAGLVEK- 

c AGSSPWWKTCSASGPGWWRR- 

ggtgcaggctgccgtgggcaccagcgccgcccctgtgcccagcgacaatcactgaacgcc 

961 + + + + + + 1020 

ccacgtccgacggcacccgtggtcgcggcggggacacgggtcgctgttagtgacttgcgg 

a GAGCRGHQRRPCAQRQSLNA 

b VQAAVGTSAAPVPSDNH*TP

c CRLPWAPAPPLCPATITERR-

gaagcctgcagccatgcgaccccacgccaccccgtgcctcctgcctccgcgcagcctgca 

1021 + + + + + + 1080

cttcggacgtcggtacgctggggtgcggtggggcacggaggacggaggcgcgtcggacgt 

a EACSHATPRHPVPPASAQPA 

b KPAAMRPHATPCLLPPRSLQ- 

c SLQPCDPTPPRASCLRAACS- 

gcgggagaccctgtccccgccccagccgtcctcctggggtggaccctagtttaataaaga 

1081 + + + + + + 1140 

cgccctctgggacaggggcggggtcggcaggaggaccccacctgggatcaaattatttct 

a AGDPVPAPAVLLGWTLV**R

b RETLSPPQPSSWGGP* FNKD- 

c GRPCPRPSRPPGVDPSLIKI- 

ttcaccaagtttcacgc 

1141 + 1157 

aagtggttcaaagtgcg 

a FTKFH

b SPSFT

c HQVSR

Enzymes that do cut: NotI 
Enzymes that do not cut : NONE 
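
The "Enzymes that do cut" summary that closes each map amounts to a search for the enzyme's recognition sequence in the mapped region. A minimal sketch, assuming the NotI recognition sequence GCGGCCGC; because that sequence is its own reverse complement, scanning the upper strand alone is sufficient. The helper `find_sites` is a hypothetical name introduced for illustration, and the input is the segment at positions 421 to 480 of Figure 5-2.

```python
# NotI recognition sequence (palindromic, so one strand is enough to scan).
NOTI_SITE = "GCGGCCGC"


def find_sites(seq, site=NOTI_SITE):
    """Return 0-based start indices of every occurrence of site in seq."""
    seq = seq.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)  # step by one so overlapping hits are kept
    return hits


# Upper strand of the apolipoprotein E segment at positions 421-480 (Figure 5-2).
APOE_421_480 = "ccggctgggcgcggacatggaggacgtgtgcggccgcctggtgcagtaccgcggcgaggt"

# Convert segment-relative indices to the map's own coordinates.
positions = [421 + i for i in find_sites(APOE_421_480)]
print(positions)
```

For this segment the search reports a single site beginning at position 450, which agrees with the NotI label printed above that segment and with the summary line for Figure 5.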
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Figure 6-1 MAP 2 

(Linear) MAP of: hsmap2c.gcg check: 2201 from: 1 to: 5595

ID HSMAP2C standard; RNA; HUM; 5595 BP.
XX 

AC L12563; 
XX 

NI g348216 

XX . . . 

With 1 enzymes: NOTI



atggctgacgagcggaaagacgaaggaaaggcacctcactggacctcagcaccgctaaca 

1 + + + + + + 60

taccgactgctcgcctttctgcttcctttccgtggagtgacctggagtcgtggcgattgt 

a MADERKDEGKAPHWTSAPLT
b WLTSGKTKERHLTGPQHR*Q-
c G*RAERRRKGTSLDLSTANR-

gaggcatctgcacactcacatccacctgagattaaggatcaaggcggagcaggggaagga 

61 + + + + + + 120

ctccgtagacgtgtgagtgtaggtggactctaattcctagttccgcctcgtccccttcct 

a EASAHSHPPEIKDQGGAGEG
b RHLHTHIHLRLRIKAEQGKD
c GICTLTST*D*GSRRSRGRT-

cttgtccgaagcgccaatggattcccatacagggaggatgaagagggtgcctttggagag 

121 + + + + + + 180

gaacaggcttcgcggttacctaagggtatgtccctcctacttctcccacggaaacctctc 

a LVRSANGFPYREDEEGAFGE
b LSEAPMDSHTGRMKRVPLES-
c CPKRQWIPIQGG*RGCLWRA-

catgggtcacagggcacctattcaaataccaaagagaatgggatcaacggagagctgacc 

181 + + + + + + 240

gtacccagtgtcccgtggataagtttatggtttctcttaccctagttgcctctcgactgg 

a HGSQGTYSNTKENGINGELT
b MGHRAPIQIPKRMGSTES*P-
c WVTGHLFKYQREWDQRRADL-
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Figure 6-2 



tcagctgacagagaaacagcagaggaggtgtctgcaaggatagttcaagtagtcactgct 
241 + + + + + + 300

agtcgactgtctctttgtcgtctcctccacagacgttcctatcaagttcatcagtgacga 

a SADRETAEEVSARIVQVVTA

b QLTEKQQRRCLQG*FK*SLL- 

c S*QRNSRGGVCKDSSSSHC*- 

gaggctgtagcagtcctgaaaggtgaacaagagaaagaagctcaacataaagaccagact 

301 + + + + + + 360 

ctccgacatcgtcaggactttccacttgttctctttcttcgagttgtatttctggtctga 

a EAVAVLKGEQEKEAQHKDQT 

b RL*QS*KVNKRKKLNIKTRL

c GCSSPER*TRERSST*RPDC- 

gcagctctgcctttagcagctgaagaaacagctaatctgcctccttctccacccccatca 

361 + + + + + + 420 

cgtcgagacggaaatcgtcgacttctttgtcgattagacggaggaagaggtgggggtagt

a AALPLAAEETANLPPSPPPS

b QLCL*QLKKQLICLLLHPHH-

c SSAFSS*RNS*SASFSTPIT-

cctgcctcagaacagactgtcacagtggaggaagcctcgaagatggagttccacgatcaa 

421 + + + + + + 480 

ggacggagtcttgtctgacagtgtcacctccttcggagcttctacctcaaggtgctagtt

a PASEQTVTVEEASKMEFHDQ

b LPQNRLSQWRKPRRWSSTIN-

c CLRTDCHSGGSLEDGVPRST- 

caggaattgactccctctacagctgagccttcagaccagaaggaaaaggagtcagagaag 
481 + + + + + + 540

gtccttaactgagggagatgtcgactcggaagtctggtcttccttttcctcagtctcttc 

a QELTPSTAEPSDQKEKESEK 

b RN*LPLQLSLQTRRKRSQRS

c GIDSLYS*AFRPEGKGVREA-

caaagtaagcctggtgaagaccttaaacatgctgccttagtttctcagccagagacaact 
541 + + + + + + 600



gtttcattcggaccacttctggaatttgtacgacggaatcaaagagtcggtctctgttga 
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Figure 6-3 

a QSKPGEDLKHAALVSQPETT 

b KVSLVKTLNMLP*FLSQRQL-

c K*AW*RP*TCCLS FSARDN*- 

aaaacttaccctgataaaaaggacatgcaaggcacggaagaagaaaaagcacccctagct

601 + + + + + + 660

ttttgaatgggactatttttcctgtacgttccgtgccttcttctttttcgtggggatcga 

a KTYPDKKDMQGTEEEKAPLA 

b KLTLIKRTCKARKKKKHP*L- 

c NLP**KGHARHGRRKSTPSF- 

ttgtttgggcacactcttgttgccagcctggaagacatgaaacagaagacagaaccaagc 

661 + + + + + + 720

aacaaacccgtgtgagaacaacggtcggaccttctgtactttgtcttctgtcttggttcg 

a LFGHTLVASLEDMKQKTEPS

b CLGTLLLPAWKT*NRRQNQA-

c VWAHSCCQPGRHETEDRTKP- 

cttgtagtacctggcattgacctccctaaagagcctccaactccaaaagaacaaaaggac 

721 + + + + + + 780 

gaacatcatggaccgtaactggagggatttctcggaggttgaggttttcttgttttcctg 

a LVVPGIDLPKEPPTPKEQKD 

b L*YLALTSLKSLQLQKNKRT-

c CSTWH*PP*RASNSKRTKGL-

tggttcatcgaaatgccaacggaagcaaaaaaggatgagtggggtttagttgcccccata 

781 + + + + + + 840 

accaagtagctttacggttgccttcgttttttcctactcaccccaaatcaacgggggtat 

a WFIEMPTEAKKDEWGLVAPI

b GSSKCQRKQKRMSGV*LPPY- 

c VHRNANGSKKG*VGFSCPHI- 

tctcctggccctctgactcccatgagggaaaaagatgtatttgatgatatcccaaaatgg 

841 + + + + + + 900

agaggaccgggagactgagggtactccctttttctacataaactactatagggttttacc 

a SPGPLTPMREKDVFDDIPKW

b LLAL*LP*GKKMYLMISQNG- 

c SWPSDSHEGKRCI**YPKMG- 
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Figure 6-4 

gaagggaaacagtttgattctcccatgccaagtccctttcaagggggaagcttcactctt 



901 + + + + + + 960 

cttccctttgtcaaactaagagggtacggttcagggaaagttcccccttcgaagtgagaa 

a EGKQFDSPMPSPFQGGSFTL

b KGNSLILPCQVPFKGEASLF-

c RETV*FSHAKSLSRGKLHSS-

cctttagatgtcatgaagaatgaaatagttacagaaacatcgccctttgcccctgccttt 

961 + + + + + + 1020 

ggaaatctacagtacttcttactttatcaatgtctttgtagcgggaaacggggacggaaa 

a PLDVMKNEIVTETSPFAPAF

b L*MS*RMK*LQKHRPLPLPF-

c FRCHEE*NSYRNIALCPCLF-

ttacagccagatgacaaaaaatctctgcaacaaaccagtggcccagctactgccaaagat

1021 + + + + + + 1080 

aatgtcggtctactgttttttagagacgttgtttggtcaccgggtcgatgacggtttcta 

a LQPDDKKSLQQTSGPATAKD 

b YSQMTKNLCNKPVAQLLPKI-

c TAR*QKISATNQWPSYCQR*-

agttttaaaattgaagagccccatgaggctaaacctgacaaaatggcagaagcaccaccc 

1081 + + + + + + 1140 

tcaaaattttaacttctcggggtactccgatttggactgttttaccgtcttcgtggtggg 

a SFKIEEPHEAKPDKMAEAPP

b VLKLKSPMRLNLTKWQKHHP

c F*N*RAP*G*T*QNGRSTTL- 

tcagaggcaatgaccttacccaaagatgctcacattccagttgtagaagaacatgttatg 

1141 + + + + + + 1200 

agtctccgttactggaatgggtttctacgagtgtaaggtcaacatcttcttgtacaatac 

a SEAMTLPKDAHIPVVEEHVM

b QRQ*PYPKMLTFQL*KNMLW-

c RGNDLTQRCSHSSCRRTCYG-

gggaaagttttagaggaagaaaaggaggccataaatcaagagactgtgcagcaaagggat

1201 + + + + + + 1260

ccctttcaaaatctccttcttttcctccggtatttagctctctgacacgtcgtttcccta 
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Figure 6-5 

a GKVLEEEKEAINQETVQQRD 

b GKF*RKKRRP*IKRLCSKGI

c ESFRGRKGGHKSRDCAAKGY-

actttcacccccagtggacaggaacctatacttactgaaaaggaaactgagctgaagctt 
1261 + + + + + + 1320 

tgaaagtgggggtcacctgtccttggatatgaatgacttttcctttgactcgacttcgaa 

a TFTPSGQEPILTEKETELKL

b LSPPVDRNLYLLKRKLS*SL-

c FHPQWTGTYTY*KGN*AEA*-

gaagaaaaaaccaccatttctgacaaagaagctgtgccaaaagagagtaaacccccaaaa 
1321 + + + + + + 1380 

cttcttttttggtggtaaagactgtttcttcgacacggttttctctcatttgggggtttt 

a EEKTTISDKEAVPKESKPPK

b KKKPPFLTKKLCQKRVNPQN-

c RKNHHF*QRSCAKRE*TPKT-

cctgcagatgaagaaataggcataattcagacctccacagagcacactttctcagaacag 

1381 + + + + + + 1440 

ggacgtctacttctttatccgtattaagtctggaggtgtctcgtgtgaaagagtcttgtc 

a PADEEIGI IQTSTEHTFSEQ 

b LQMKK*A* FRPPQSTLSQNR- 

c CR*RNRHNSDLHRAHFLRTE-

aaagaccaagagcctaccacagatatgttgaaacaggactcgttccctgtaagtttggag 

1441 + + + + + + 1500 

tttctggttctcggatggtgtctatacaactttgtcctgagcaagggacattcaaacctc 

a KDQEPTTDMLKQDSFPVSLE 

b KTKSLPQIC*NRTRSL*VWS- 

c RPRAYHRYVETGLVPCKFGA- 

caagcagttacagattcagccatgacctctaaaacactggagaaagccatgaccgaacca 

1501 + + + + + + 1560 

gttcgtcaatgtctaagtcggtactggagattttgtgacctctttcggtactggcttggt 

a QAVTDSAMTSKTLEKAMTEP

b KQLQIQP*PLKHWRKP*PNH-

c SSYRFSHDL*NTGESHDRTI-
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Figure 6-6 

tctgcattaattgaaaagagctcaattcaggaactttttgaaatgagagttgatgacaaa 

1561 + + + + + + 1620 

agacgtaattaacttttctcgagttaagtccttgaaaaactttactctcaactactgttt 

a SALIEKSSIQELFEMRVDDK 

b LH*LKRAQFRNFLK*ELMTK-

c CIN*KELNSGTF*NES**QR- 

gataagattgaaggagttggagctgcaacatcagctgagcttgatatgccattttatgaa 

1621 + + + + + + 1680 

ctattctaacttcctcaacctcgacgttgtagtcgactcgaactatacggtaaaatactt 

a DKIEGVGAATSAELDMPFYE

b IRLKELELQHQLSLICHFMK-

c *D*RSWSCNIS*A*YAIL*R-

gataaatcaggaatgtccaagtactttgaaacatctgccttgaaagaagaagcaacaaaa 

1681 + + + + + + 1740 

ctatttagtccttacaggttcatgaaactttgtagacggaactttcttcttcgttgtttt

a DKSGMSKYFETSALKEEATK 

b INQECPSTLKHLP*KKKQQK- 

c *IRNVQVL*NICLERRSNKK-

agcattgagccaggcagtgattactatgaactgagtgacactagagaaagtgtccatgag

1741 + + + + + + 1800 

tcgtaactcggtccgtcactaatgatacttgactcactgtgatctctttcacaggtactc 

a SIEPGSDYYELSDTRESVHE

b ALSQAVITMN*VTLEKVSMS-

c H*ARQ*LL*TE*H*RKCP*V-

tctattgataccatgtctcccatgcataaaaatggtgacaaggagtttcaaacaggaaaa 

1801 + + + + + + 1860

agataactatggtacagagggtacgtatttttaccactgttcctcaaagtttgtcctttt 

a SIDTMSPMHKNGDKEFQTGK

b LLIPCLPCIKMVTRSFKQEK-

c Y*YHVSHA*KW*QGVSNRKR-

gaatcccagcccagtcctccagcacaagaagcagggtacagcactctcgcacagagttat 

1861 + + + + + + 1920 

cttagggtcgggtcaggaggtcgtgttcttcgtcccatgtcgtgagagcgtgtctcaata 
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Figure 6-7 

a ESQPSPPAQEAGYSTLAQSY

b NPSPVLQHKKQGTALSHRVI-

c IPAQSSSTRSRVQHSRTELS-

ccatcagatttacctgaagaacccagttctcctcaagaaagaatgttcactattgatcca 

1921 + + + + + + 1980 

ggtagtctaaatggacttcttgggtcaagaggagttctttcttacaagtgataactaggt 

a PSDLPEEPSSPQERMFTIDP 

b HQIYLKNPVLLKKECSLLIQ- 

c IRFT*RTQFSSRKNVHY*SK-

aaagtgtatggagagaaaagggacctccacagtaagaataaggatgatttgacccttagc 

1981 + + + + + + 2040 

tttcacatacctctcttttccctggaggtgtcattcttattcctactaaactgggaatcg 

a KVYGEKRDLHSKNKDDLTLS

b KCMERKGTSTVRIRMI*PLA-

c SVWREKGPPQ*E*G*FDP*Q- 

aggagtttaggacttggtggtaggtctgcaatagaacaaagaagcatgtcaatcaatttg 

2041 + + + + + + 2100 

tcctcaaatcctgaaccaccatccagacgttatcttgtttcttcgtacagttagttaaac 

a RSLGLGGRSAIEQRSMSINL

b GV*DLVVGLQ*NKEACQSIC-

c EFRTWW*VCNRTKKHVNQFA-

ccgatgtcttgcctagattccatagcccttggatttaactttggtcggggacatgatctt 

2101 + + + + + + 2160 

ggctacagaacggatctaaggtatcgggaacctaaattgaaaccagcccctgtactagaa 

a PMSCLDSIALGFNFGRGHDL

b RCLA* IP*PLDLTLVGDMIF- 

c DVLPRFHSPWI*LWSGT*SF- 

tctcctctggcttccgatattctaaccaacactagtggaagtatggatgaaggggatgat 

2161 + + + + + + 2220

agaggagaccgaaggctataagattggttgtgatcaccttcatacctacttcccctacta 

a SPLASDILTNTSGSMDEGDD

b LLWLPIF*PTLVEVWMKGMI-

c SSGFRYSNQH*WKYG*RG*L-
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Figure 6-8 

taccttccagccaccacacctgcactggagaaagccccttgcttccctgtagaaagcaaa 
2221 + + + + + + 2280 

atggaaggtcggtggtgtggacgtgacctctttcggggaacgaagggacatctttcgttt 

a YLPATTPALEKAPCFPVESK 

b TFQPPHLHWRKPLASL*KAK-

c PSSHHTCTGESPLLPCRKQR-

gaggaagaacagatagagaaagtaaaagctactggagaagaaagtactcaagcggagata 

2281 + + + + + + 2340

ctccttcttgtctatctctttcattttcgatgacctcttctttcatgagttcgcctctat 

a EEEQIEKVKATGEESTQAEI

b RKNR*RK*KLLEKKVLKRRY

c GRTDRESKSYWRRKYSSGDI-

tcatgtgagtctcctttcctagccaaagatttttacaaaaatggtactgtcatggcacct 

2341 + + + + + + 2400 

agtacactcagaggaaaggatcggtttctaaaaatgtttttaccatgacagtaccgtgga

a SCESPFLAKDFYKNGTVMAP

b HVSLLS*PKIFTKMVLSWHL-

c M*VSFPSQRFLQKWYCHGT*-

gaccttcctgaaatgctagatctggcaggcacaaggtcaagattggcttctgtgagtgca 

2401 + + + + + + 2460 

ctggaaggactttacgatctagaccgtccgtgttccagttctaaccgaagacactcacgt 

a DLPEMLDLAGTRSRLASVSA 

b TFLKC*IWQAQGQDWLL*VQ

c PS*NARSGRHKVKIGFCECR-

gatgctgaggttgccaggaggaaatcagtcccatcagagactgtggttgaggatagtcgt 

2461 + + + + + + 2520 

ctacgactccaacggtcctcctttagtcagggtagtctctgacaccaactcctatcagca 

a DAEVARRKSVPSETVVEDSR 

b MLRLPGGNQSHQRLWLRIVV- 

c C*GCQEEISPIRDCG*G*SY-

accggcttgcccccggtaactgatgaaaaccatgtcattgtaaaaacggacagtcagctc 

2521 + + + + + + 2580 

tgaccgaacgggggccattgactacttttggtacagtaacatttttgcctgtcagtcgag 
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Figure 6-9 



a TGLPPVTDENHVIVKTDSQL 

b LACPR*LMKTMSL*KRTVSS-

c WLAPGN**KPCHCKNGQSAR-

gaagacctgggctactgtgtgttcaataagtacacagtcccattgccatcacctgttcaa 

2581 + + + + + + 2640 

cttctggacccgatgacacacaagttattcatgtgtcagggtaacggtagtggacaagtt 

a EDLGYCVFNKYTVPLPSPVQ

b KTWATVCSISTQSHCHHLFK-

c RPGLLCVQ*VHSPIAITCSR-

gacagtgagaatttatcaggggagagtggtaccttttacgaaggcactgatgataaagtt

2641 + + + + + + 2700 

ctgtcactcttaaatagtcccctctcaccatggaaaatgcttccgtgactactatttcaa 

a DSENLSGESGTFYEGTDDKV 

b TVRIYQGRVVPFTKALMIKF-

c Q*EFIRGEWYLLRRH***SS-

cgaagagatttggccacagacctttcactgattgaagtgaaactggcagcagccggaaga 
2701 + + + + + + 2760 

gcttctctaaaccggtgtctggaaagtgactaacttcactttgaccgtcgtcggccttct

a RRDLATDLSLIEVKLAAAGR

b EEIWPQTFH*LK*NWQQPEE-

c KRFGHRPFTD*SETGSSRKS-

gtcaaagatgagttcagtgttgacaaagaagcatccgcgcatatctctggtgacaaatca 

2761 + + + + + + 2820 

cagtttctactcaagtcacaactgtttcttcgtaggcgcgtatagagaccactgtttagt 

a VKDEFSVDKEASAHISGDKS

b SKMSSVLTKKHPRISLVTNQ- 

C QR*VQC*QRSIRAYLW*QIR- 

ggactgagtaaggagtttgaccaagagaagaaagctaatgataggttggatactgtacta 

2821 + + + + + + 2880 

cctgactcattcctcaaactggttctcttctttcgattactatccaacctatgacatgat 

a GLSKEFDQEKKANDRLDTVL 

b D*VRSLTKRRKLMIGWILY*

c TE*GV*PREES***VGYCTR-
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Figure 6-10 



gaaaagagtgaagaacatgctgattcaaaagaacatgccaagaaaactgaagaggctggt 
2881 + + + + + + 2940 

cttttctcacttcttgtacgactaagttttcttgtacggttcttttgacttctccgacca 

a EKSEEHADSKEHAKKTEEAG

b KRVKNMLIQKNMPRKLKRLV-

c KE*RTC*FKRTCQEN*RGW*-

gatgaaatagaaacattcggattaggagtaacctatgagcaagctttggccaaagatttg 
2941 + + + + + + 3000

ctactttatctttgtaagcctaatcctcattggatactcgttcgaaaccggtttctaaac 

a DEIETFGLGVTYEQALAKDL

b MK*KHSD*E*PMSKLWPKIC- 

c *NRNIRIRSNL*ASFGQRFV- 

tcaataccaacagatgcatcctctgagaaagcagagaagggtcttagttcagttccagag 

3001 + + + + + + 3060

agttatggttgtctacgtaggagactctttcgtctcttcccagaatcaagtcaaggtctc 

a SIPTDASSEKAEKGLSSVPE

b QYQQMHPLRKQRRVLVQFQR-

c NTNRCIL*ESREGS*FSSRD- 

atagctgaggtagaaccatccaaaaaggtggaacaaggtctggattttgctgtccagggt 

3061 + + + + + + 3120 

tatcgactccatcttggtaggtttttccaccttgttccagacctaaaacgacaggtccca 

a IAEVEPSKKVEQGLDFAVQG

b *LR*NHPKRWNKVWILLSRV-

c S*GRTIQKGGTRSGFCCPGS-

caactagatgttaaaattagtgactttggacagatggcttcagggctaaacatagatgat 

3121 + + + + + + 3180 

gttgatctacaattttaatcactgaaacctgtctaccgaagtcccgatttgtatctacta 

a QLDVKISDFGQMASGLNIDD

b N*MLKLVTLDRWLQG*T*MI-

c TRC*N**LWTDGFRAKHR**-

agaagggcaacagagctaaaacttgaggctacacaggacatgaccccctcatccaaagca 

3181 + + + + + + 3240 

tcttcccgttgtctcgattttgaactccgatgtgtcctgtactgggggagtaggtttcgt 
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Figure 6-11 

a RRATELKLEATQDMTPSSKA 

b EGQQS*NLRLHRT*PPHPKH-

c KGNRAKT*GYTGHDPLIQST-

ccgcaggaggcagatgcatttatgggtgttgagtctggccacatgaaagaaggcactaaa 

3241 + + + + + + 3300 

ggcgtcctccgtctacgtaaatacccacaactcagaccggtgtactttcttccgtgattt 

a PQEADAFMGVESGHMKEGTK

b RRRQMHLWVLSLAT*KKALK-

c AGGRCIYGC*VWPHERRH*S-

gttagtgagacagaagtcaaacagaaggtggccaagcctgacttggtgcaccaggaggct 

3301 + + + + + + 3360

caatcactctgtcttcagtttgtcttccaccggttcggactgaaccacgtggtcctccga 

a VSETEVKQKVAKPDLVHQEA 

b LVRQKSNRRWPSLTWCTRRL-

c **DRSQTEGGQA*LGAPGGC-

gtagacaaggaggagtcctatgaatctagtggtgagcatgaaagtctcaccatggagtcc

3361 + + + + + + 3420

catctgttcctcctcaggatacttagatcaccactcgtactttcagagtggtacctcagg 

a VDKEESYESSGEHESLTMES 

b *TRRSPMNLVVSMKVSPWSP- 

c R QGGVL* I * W * A * KSH HGVL- 

ttgaaagctgatgagggcaagaaggaaacatctccagaatcatctctaattcaagatgag 

3421 + + + + + + 3480 

aactttcgactactcccgttcttcctttgtagaggtcttagtagagattaagttctactc 

a LKADEGKKETS PES SLIQDE 

b * KLMRARRKHLQNHL* FKMR

c ES**GQEGNISRIISNSR*D- 

attgccgtcaaattgtcagtggaaataccttgcccacctgctgtttcagaggctgattta 

3481 + + + + + + 3540

taacggcagtttaacagtcacctttatggaacgggtggacgacaaagtctccgactaaat 

a IAVKLSVEIPCPPAVSEADL 

b LPSNCQWKYLAHLLFQRLI * 

c CRQI VSGNTLPTCCFRG* FS- 
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Figure 6-12 

gccacagatgagagagctgatgtccagatggaatttattcaggggccaaaagaagaaagc 

3541 + + + + + + 3600 

cggtgtctactctctcgactacaggtctaccttaaataagtccccggttttcttctttcg 

a ATDERADVQMEFIQGPKEES

b PQMREL MSRWNLFRGQKKKA-

c HR * ES * C PDG I Y SGAKRRKQ-

aaagagaccccagatatatccatcacgccttctgatgttgcagagccattgcatgaaacg 

3601 + + + + + + 3660

tttctctggggtctatataggtagtgcggaagactacaacgtctcggtaacgtactttgc 

a KETPDISITPSDVAEPLHET 

b KRPQIYPS RLLMLQSHCMKR- 

c RDPRYI HHAF *CCRAIA*ND- 

atcgtatctgaaccagcagagattcagagtgaggaagaagagatagaagcccagggagaa 

3661 + + + + + + 3720 

tagcatagacttggtcgtctctaagtctcactccttcttctctatcttcgggtccctctt 

a IVSEPAEIQSEEEEI EAQGE 

b SYLNQQRFRVRKKR* KPREN- 

c RI*TSRDSE*GRRDRSPGRI- 

tatgataaactgctcttccgctcagacacccttcagataactgacctgggtgtctcaggt 

3721 + + + + + + 3780 

atactatttgacgagaaggcgagtctgtgggaagtctattgactggacccacagagtcca 

a YDKLLFRSDTLQITDLGVSG 

b MINCSSAQTPFR*LTWVSQV- 

c **TALPLRHPSDN*PGCLRC- 

gccagggaggaatttgtggagacctgcccaagtgaacacaaaggagtgattgagtctgtt 

3781 + + + + + + 3840 

cggtccctccttaaacacctctggacgggttcacttgtgtttcctcactaactcagacaa 

a AREEFVET CPSEHKGVIESV 

b PGRNLWRPAQVNTKE* LSLL- 

C QGG I CG DLPK* TQRSD *VCC-



gtgaccatcgaggatgatttcatcactgtagtgcaaaccacaactgatgaaggggagtca 

3841 + + + + + + 3900

cactggtagctcctactaaagtagtgacatcacgtttggtgttgactacttcccctcagt 
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Figure 6-13 

a VTI EDDFITVVQTTTDEGES 

b *PSRMISSL*CKPQLMKGSQ- 

c DHRG* FHHCSANHN* * RGVR- 

gggtcccacagcgtgcgttttgcagccctagagcagcctgaggtggaaaggagaccatct 

3901 + + + + + + 3960 

cccagggtgtcgcacgcaaaacgtcgggatctcgtcggactccacctttcctctggtaga 

a GSHSVRFAALEQPEVERRPS 

b GPTACVLQP*SSLRWKGDHL- 

C VPQRAFCSPRAA * GGKET I S - 

cctcatgatgaagaagagtttgaagtagaagaggcagctgaagcccaggcagaacccaaa 

3961 + + + + + + 4020 

ggagtactacttcttctcaaacttcatcttctccgtcgacttcgggtccgtcttgggttt

a PHDEEEFEVEEAAEAQAEPK 

b LMMKKSLK* KRQLKPRQNPK- 

c S**RRV*SRRGS*SPGRTQR- 

gatggttccccagaggctccagcttcccctgagagagaagaggttgcactttctgaatat

4021 + + + + + + 4080 

ctaccaaggggtctccgaggtcgaaggggactctctcttctccaacgtgaaagacttata 

a DGS PEAPASPEREEVALSEY 

b MVPQRLQLPLREKRLHFLNI

c WFPRGSSFP*ERRGCTF* I * - 

aagacagaaacctatgacgattacaaagatgagaccaccattgacgactccatcatggac 

4081 + + + + + + 4140 

ttctgtctttggatactgctaatgtttctactctggtggtaactgctgaggtagtacctg 

a KTE TYDDYKDE TT I DDS I MD 

b RQKPMTITKMRPPLTTPSWT- 

C DRNL * RLQR * DHH * RLHHGR- 

gctgacagcctctgggtggacactcaagatgatgataggagcatcatgacagaacagtta 

4141 + + + + + + 4200

cgactgtcggagacccacctgtgagttctactactatcctcgtagtactgtcttgtcaat

a ADS LWVDTQDDDRS IMTEQL 

b LTASGWTLKMMIGAS * Q N S *

C *QPLGGHSR* * * EHHDRTVR- 
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Figure 6-14 

gaaactattcctaaagaggagaaagctgaaaaggaagctcggagatcatctcttgagaaa 

4201 + + + + + + 4260 

ctttgataaggatttctcctctttcgacttttccttcgagcctctagtagagaactcttt 

a ETIPKEEKAEKEARRSSLEK
b KLFLKRRKLKRKLGDHLLRN-
c NYS*RGES*KGSSEIIS*ET-

catagaaaagaaaagccttttaaaaccgggagaggcagaatttccactcctgaaagaaaa 

4261 + + + + + + 4320 

gtatcttttcttttcggaaaattttggccctctccgtcttaaaggtgaggactttctttt 

a HRKEKPFKTGRGR ISTPERK 

b IEKKSLLKPGEAEFPLLKEK- 

c *KRKAF*NRERQNFHS*KKS- 

gtagctaaaaaggaacctagcacagtctccagagatgaagtgagaaggaaaaaagcagtt 

4321 + + + + + + 4380 

catcgatttttccttggatcgtgtcagaggtctctacttcactcttccttttttcgtcaa 

a VAKKE PSTVSRDEVRRKKAV 

b * LKRNLAQSPEMK* EGKKQF 

c S * KGT*HSLQR* SEKEKSSL-

tataagaaggctgaacttgctaaaaaaacagaagttcaggcccactctccctccaggaaa

4381 + + + + + + 4440

atattcttccgacttgaacgatttttttgtcttcaagtccgggtgagagggaggtccttt

a YKKAELAKKTEVQAHS PSRK 

b IRRLNLLKKQKFRPTLPPGN- 

c * E G * TC* KNRSSGPLSLQEI- 

ttcattttaaaacctgctatcaaatatactagaccaactcatctctcctgtgttaagcgg 

4441 + + + + + + 4500 

aagtaaaattttggacgatagtttatatgatctggttgagtagagaggacacaattcgcc 

a FILKPAIKYTRPTHLSCVKR 

b SF*NLLSNILDQLISPVLSG- 

C HFKTCYQIY * TNSS LLC* A E - 

aaaaccacagcagcaggtggggaatcagctctggctcccagtgtatttaaacaggcaaag 

4501 + + + + + + 4560 

ttttggtgtcgtcgtccaccccttagtcgagaccgagggtcacataaatttgtccgtttc 
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Figure 6-15 

a KTTAAGGESALAPSVFKQAK 

b KPQQQVGNQLWLPVYLNRQR-

C NHSSRWGISSGSQCI*TGKG- 

gacaaagtctctgacggagtaaccaagagcccagaaaagcgctcttctctcccaagacct 

4561 + + + + + + 4620 

ctgtttcagagactgcctcattggttctcgggtcttttcgcgagaagagagggttctgga 

a DKVSDGVTKS PEKRS SLPRP 

b TKSLTE* PRAQKSALLSQDL 

c QSL*RSNQEPRKALFSPKTF-

tcctccattctccctcctcggcgaggtgtgtcaggagacagagatgagaattccttctct 

4621 + + + + + + 4680 

aggaggtaagagggaggagccgctccacacagtcctctgtctctactcttaaggaagaga

a SSILPPRRGVSGDRDENSFS 

b PPFSLLGEVCQETEMRIPSL- 

C LHS PSSARCVRRQR * EFLLS- 

ctcaacagttctatctcttcttcagcacggcggaccaccaggtcagagccaattcgcaga 

4681 + + + + + + 4740 

gagttgtcaagatagagaagaagtcgtgccgcctggtggtccagtctcggttaagcgtct 

a LNSS ISSSARRTTRS EPIRR 

b STVLSLLQHGGPPGQSQFAE

c Q Q F Y L F FS TAD HQVRA N S Q S- 

gcagggaagagtggtacctcaacacccactacccctgggtctactgccatcactcctggc 

4741 + + + + + + 4800 

cgtcccttctcaccatggagttgtgggtgatggggacccagatgacggtagtgaggaccg 

a AGKSGTSTPTTPGSTAITPG 

b QGRVVPQHPLPLGLLPSLLA- 

c REEWYLNTHY PWVYCHHSWH- 

accccaccaagttattcttcacgcacaccaggcactcctggaacccctagctatcccagg 

4801 + + + + + + 4860

tggggtggttcaataagaagtgcgtgtggtccgtgaggaccttggggatcgatagggtcc 

a TPPSYSSRTPGTPGTPSYPR 

b PHQVI LHAHQALLE PLAI PG

c PTKLFFTHTRHSWN P * LSQD- 



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Figure 6-16 

acccctcacacaccaggaacccccaagtctgccatcttggtgccgagtgagaagaaggtc 

4861 + + + + + + 4920 

tggggagtgtgtggtccttgggggttcagacggtagaaccacggctcactcttcttccag 

a TPHTPGTPKSAILVPSEKKV 

b PLTHQEPPSLPSWCRVRRRS- 

c PSHTRNPQVCHLGAE* E E G R - 

gccatcatacgtactcctccaaaatctcctggactgactcccaagcagcttcggcttatt 

4921 + + + + + + 4980

cggtagtatgcatgaggaggttttagaggacctgactgagggttcgtcgaagccgaataa 

a AIIRTPPKSPGLTPKQLRLI 

b PSYVLLQNLLD* LPSSFGLL- 

c HHTYSSKI SWTDSQAASAY * - 

aaccaaccactgccagacctgaagaatgtcaaatccaaaatcggatcaacagacaacatc 

4981 + + + + + + 5040 

ttggttggtgacggtctggacttcttacagtttaggttttagcctagttgtctgttgtag 

a NQPLPDLKNVKSKIGSTDNI 

b TNHCQT* RMSNPKSDQQTTS 

c PTTARPEECQ I QNRINRQHQ- 

aaataccagcctaaaggggggcaggtacaaattgttaccaagaagatagacctaagccat 

5041 + + + + + + 5100 

tttatggtcggatttccccccgtccatgtttaacaatggttcttctatctggattcggta 

a KYQPKGGQVQIVTKKIDLSH 

b NTSLKGGRYKLLPRR*T*AM- 

c I PA* RGAGTNCYQEDRPKPC- 

gtgacatccaaatgtggctctctgaagaacatccgccacaggccaggtggcggacgtgtg 

5101 + + + + + + 5160 

cactgtaggtttacaccgagagacttcttgtaggcggtgtccggtccaccgcctgcacac 

a VTSKCGSLKNI RHRPGGGRV 

b * HPNVAL* RTSATGQVADV* 

c DIQMWLSEEHPPQARWRTCE- 

aaaattgagagtgtaaaactagatttcaaagaaaaggcccaagctaaagttggttctctt 

5161 + + + + + + 5220

ttttaactctcacattttgatctaaagtttcttttccgggttcgatttcaaccaagagaa 
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Figure 6-17 



a KIESVKL DFKEKAQAKVGSL

b K L R V * N * ISKKRPKLKLVLL- 

c N*ECKTRFQRKGPS * S W F S * - 

gataatgctcatcatgtacctggaggtggtaatgtcaagattgacagccaaaagttgaac 
5221 + + + + + + 5280

ctattacgagtagtacatggacctccaccattacagttctaactgtcggttttcaacttg 

a DNAHHVPGGGNVKI DSQKLN

b I ML IMYLEVVMSRLTAKS * T - 

c *CSSCTWRW*CQD*QPKVEL- 

ttcagagagcatgctaaagcccgtgtggaccatggggctgagatcattacacagtcccca 

5281 + + + + + + 5340 

aagtctctcgtacgatttcgggcacacctggtaccccgactctagtaatgtgtcaggggt 

a FREHAKARVDHGAEI ITQS P 

b SESMLKPVWTMGLRSLHSPQ- 

c QRAC*SPCGPWG*DHYTVPR- 

ggcagatccagcgtggcatcaccccgacgactcagcaatgtctcctcgtctggaagcatc 

5341 + + + + + + 5400

ccgtctaggtcgcaccgtagtggggctgctgagtcgttacagaggagcagaccttcgtag 

a GRSSVASPRRLSNVSSSGSI 

b ADPAWHHPDDSAMS PRLEAS 

C Q I QRGITPTTQQCLLVWKHQ- 

aacctgctcgaatctcctcagcttgccactttggctgaggatgtcactgctgcactcgct 
5401 + + + + + + 5460 

ttggacgagcttagaggagtcgaacggtgaaaccgactcctacagtgacgacgtgagcga 

a NLLES PQLATLAEDVTAALA

b TCSNLLSLPLWLRMSLLHSL- 

c PAR I S SACHFG * GCHCCTR * - 

aagcagggcttgtgaatatttctcatttagcattgaaataataatatttaggcatgagct 
5461 + + + + + + 5520 

ttcgtcccgaacacttataaagagtaaatcgtaactttattattataaatccgtactcga 

a KQGL* I F L I *H*NNNI*A*A

b SRACEYFSFSIEII IFRHEL-

c AGLVNISHLALK**YLGMSS-
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Figure 6-18 

cttggcaggagtgggctctgagcagttgttatatcattctttataaaccataaaataaat 

5521 + + + + + + 5580 

gaaccgtcctcacccgagactcgtcaacaatatagtaagaaatatttggtattttattta 

a LGRSGL* AVV I S FF I N H K IN 

b LAGVGSEQLLYHSL*TIK* I 

c WQEWALS S CYI I LYKP * N K * -

aatctcccggaattc 

5581 + 5595 

ttagagggccttaag 

a N L P E F 

b I S R N 

c S P G I - 

Enzymes that do cut : 

NONE 

Enzymes that do not cut: 
NotI 



WO 98/45322 PCT/IB98/00705 

37/169 

Figure 7-1 Neurofilament L 

(Linear) MAP of: hsnflg.gcg check: 5926 from: 1 to: 4682

ID HSNFLG standard; DNA; HUM; 4682 BP.
XX 

AC X05608; S42443; 
XX 

NI el002618 

XX . . . 

With 1 enzymes : NOTI 

October 31, 1996 14:33 

aaggatccaagtgtcacggggtctgggcaatgcaggacgggaggggctgcgtgagtgagt 

1 + + + + + + 60 

ttcctaggttcacagtgccccagacccgttacgtcctgccctccccgacgcactcactca

a KDPSVTGSGQCRTGGAA * V S 

b RIQVSRGLGNAGREG LRE * V 

C GSKCHGVWAMQDGRGCVSEY- 

acagaagggaaatgagtgagggggcatgggatctcagagaaaatcagggacctctgagca 

61 + + + + + + 120 

tgtcttccctttactcactcccccgtaccctagagtctcttttagtccctggagactcgt

a TEGK*VRGHGISEKIRDL*A

b QKGNE* GGMGSQRKSGTSEQ 

c RREMS EGAWDLRENQG P LS K - 

aagtggaaaggacgaccgccgcagctcctcgggccgtagctcgaccccgccttcccttt

121 + + + + + + 180 

ttcacctttcctgctggcggcgtcgaggagcccggcatcgagctggggcggaagggaaaa 

a KW KGRPPQLLGP * LD PAFPF 

b SGKDDRRSSSGRSSTPPSLF 

c VERTTAAAPRAVAR PRL PFS- 

ccgcagaatcctcgccttggctgcagcagcgcgctgcccccactggccggcgtgccgtga 

181 + + + + + + 240 

ggcgtcttaggagcggaaccgacgtcgtcgcgcgacgggggtgaccggccgcacggcact 

a PQNPRLGCSSALPPLAGVP* 

b RRILALAAAARCPHWPACRD 

c AESS PWLQQRAAPTGRRAVI- 
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Figure 7-2 



tcgatcgcaggctgcgtcaggacctcccggcgtataaataggggtggcagaacggcgccg 

241 + + + + + + 300 

agctagcgtccgacgcagtcctggagggccgcatatttatccccaccgtcttgccgcggc 

a SIAGCVRTSRRINRGGRTAP 

b RSQAASGPPGV* IGVAERRR- 

c DRRLRQDLPAYK* GWQN GAE- 

agccgcacacagccatccatcctcccccttccctctctcccctgtcctctctctccgggc 

301 + + + + + + 360 

tcggcgtgtgtcggtaggtaggagggggaagggagagaggggacaggagagagaggcccg 

a SRTQPSILPLPSLPCPLSPG 

b AAHSHPSSPFPLSPVLSLRA- 

C PHTAIHPPPSLSPLSSLSGL- 

tcccaccgccgccggggagcaccggccgccaaccaatgagttccttcagctacgagccgt 

361 + + + + + + 420 

agggtggcggcggcccctcgtggccggcggttggttactcaaggaagtcgatgctcggca

a SHRRRGAPAANQ*VPSATSR 

b PTAAGEHRP PTNEFLQLRAV- 

C PPPPGSTGRQPMSS FSYEPY- 

actactcgacctcctacaagcggcgctacgtggagacgccccgggtgcatatcagcgtgc 

421 + + + + + + 480 

tgatgagctggaggatgttcgccgcgatgcacctctgcggggcccacgtatagtcgcacg 

a TTRPPTSGATWRRPGCI SAC 

b LLDLLQAALRGDAPGAYQRA- 

c YSTSYKRRYVETPRVH I SVR- 

gcagcggctacagcaccgcacgctcagcttactcaagctactcggcgccggtgtcttcct 

481 + + + + + + 540

cgtcgccgatgtcgtggcgtgcgagtcgaatgagttcgatgagccgcggccacagaagga 

a AAATAPHAQLTQATRRRCLP 

b QRLQHRTLSLLKLLGAGVFL 

c SGYSTARSAYSSYSAPVSSS- 

cgctgtccgtgcgccgcagctactcctccagctctggatcgttgatgcccagtctggaga 

541 + + + + + + 600 

gcgacaggcacgcggcgtcgatgaggaggtcgagacctagcaactacgggtcagacctct 
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Figure 7-3 

a RCPCAAATPPALDR*CPVWR 

b AVRAPQLLLQLWIVDAQSGE 

c LSVRRSYSSSSGSLMPSLEN- 

acctcgacctgagccaggtagccgccatcagcaacgacctcaagtccatccgcacgcagg 
601 + + + + + + 660

tggagctggactcggtccatcggcggtagtcgttgctggagttcaggtaggcgtgcgtcc 

a TST*AR*PPSATTSSPSARR 

b PRPEPGSRHQQRPQVHPHAG- 

c LDLSQVAAI SND LKS I RTQE- 

agaaggcgcagctccaggacctcaatgaccgcttcgccagcttcatcgagcgcgtgcacg 
661 + + + + + + 720 

tcttccgcgccgaggtcctggagttactggcgaagcggtcgaagtagctcgcgcacgtgc 

a RRRSSRTSMTASPASSSACT 

b EGAAPGPQ* PLRQLHRARAR 

C KAQLQDLNDRFAS F I ERVHE- 

agctggagcagcagaacaaggtcctggaagccgagctgctggtgctgcgccagaagcact 

721 + + + + + + 780

tcgacctcgtcgtcttgttccaggaccttcggctcgacgaccacgacgcggtcttcgtga 

a SWSSRTRSWKPSCWCCARST 

b AGAAEQGPGSRAAGAAPEAL 

C LEQQNKVLEAELLVLRQKH S- 

ccgagccaccccgcttccgggcgctgtacgagcaggagatccgcgacctgcgcctagcgg 
781 + + + + + + 840

ggctcggtagggcgaaggcccgcgacatgctcgtcctctaggcgctggacgcggatcgcc 

a PSHPASGRCTSRRSATCA* R 

b RAI PLPGAVRAGDP RPAPSG- 

c E PS R FRALYEQE I RDLRLAA- 

cggaagatgccaccaccaacgagaagcaagcgctccgaggcgagcgcgaagaagggctgg 
841 + + + + + + 900 

gccttctacggtggtggttgctcttcgttcgcgaggctccgctcgcgcttcttcccgacc 

a RKMPPPTRSKRS EASAKKGW 

b GRCHHQREASAPRRARRRAG 

C EDATTNEKQAL RG E R E EGLE- 
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Figure 7-4 

aggagaccctgcgcaacctgcaggcgcgctatgaagaggaggtgctgagccgcgaggacg 

901 + + + + + + 960

tcctctgggacgcgttggacgtccgcgcgatacttctcctccacgactcggcgctcctgc 

a RRPCATCRRAMKRRC*AART 

b GDPAQPAGAL*RGGAEPRGR 

c ETLRNLQARYEE EVLS R E D A - 

ccgagggccggctgatggaacgccgcaaaggcgccgacgaggcggcgctcgctcgcgccg 

961 + + + + + + 1020 

ggctcccggccgactaccttgcggcgtttccgcggctgctccgccgcgagcgagcgcggc 

a PRAG * WNAAKAPTRRRS LA P 

b RGPADGTPQRRRRGGARSRR 

c EGRLMERRKGADEAALARAE- 

agctcgagaagcgcatcgacagcttgatggacgaaatctcttttctgaagaaagtgcacg 

1021 + + + + + + 1080 

tcgagctcttcgcgtagctgtcgaactacctgctttagagaaaagacttctctcacgtgc 

a SSRSASTA* WTKSLF* R KCT 

b AREAHRQLDGRNLFSEESAR

c LEKRIDSLMDEI S FLKKVHE- 

aagaggagatcgccgaactgcaggcgcagatccagtacgcgcagatctccgtggagatgg 

1081 + + + + + + 1140 

ttctcctctagcggcttgacgtccgcgtctaggtcatgcgcgtctagaggcacctctacc 

a KRRSPNCRRRSSTRRS PWRW 

b RGDRRTAGADPVRADLRGDG 

c EEIAELQAQIQYAQISVEMD-

acgtgaccaagcccgacctttccgccgcgctcaaggacatccgcgcgcagtacgagaagc 

1141 + + + + + + 1200 

tgcactggttcgggctggaaaggcggcgcgagttcctgtaggcgcgcgtcatgctcttcg 

a T*PSPTFPPRSRTSARSTRS 

b RDQARPFRRAQGHPRAVREA 

c VTKPDLSAALKD I RAQYE K L - 

tggccgccaagaacatgcagaacgctgaggaatggttcaagagccgcttcacggtgctga 

1201 + + + + + + 1260 

accggcggttcttgtacgtcttgcgactccttaccaagttctcggcgaagtgccacgact 
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Figure 7-5 

a WPPRTCRTLRNGSRAASR C* 

b GRQEHAER* GMVQEPLHGAD 

c AAKNMQNAE EW F KS R FTVLT- 

ccgagagcgccgccaagaacaccgacgccgtgcgcgccgccaaggacgaggtgtcggaga 

1261 + + + + + + 1320 

ggctctcgcggcggttcttgtggctgcggcacgcgcggcggttcctgctccacagcctct 

a PRAPPRTPTPCAPPRTRCRR 

b RERRQEHRRRARRQGRGVGE- 

c ESAAKNTDAVRAAKDEVSES- 

gccgtcgtctgctcaaggccaagaccctggaaatcgaagcatgccggggcatgaatgaag 

1321 + + + + + + 1380 

cggcagcagacgagttccggttctgggacctttagcttcgtacggccccgtacttacttc

a AVVCSRPRPWKSKHAGA*MK

b PSSAQGQDPGNRSMPGHE*S

c RRLLKAKTLEI EACRGMNEA- 

cgctggagaagcagctgcaggagctggaggacaagcagaacgccgacatcagcgctatgc 

1381 + + + + + + 1440 

gcgacctcttcgtcgacgtcctcgacctcctgttcgtcttgcggctgtagtcgcgatacg 

a RWRSSCRSWRTSRTPTSALC 

b AGEAAAGAGGQAERRHQRYA- 

c LEKQLQELEDKQNADI SAMQ- 

aggtgcggcacggccagaaacacaggggggcggggaactcgagcaagggggggagttggt

1441 + + + + + + 1500 

tccacgccgtgccggtctttgtgtccccccgccccttgagctcgttccccccctcaacca 

a RCGTARNTGGRGTRARGGVG 

b GAARPETQGGGELEQGGELV- 

c VRHGQKHRGAGNSS KGG SWC- 

gcgcccagaaagcgaaaccaggggtggtgcggctgcccagctcttagggatagggcttgg 

1501 + + + + + + 1560 

cgcgggtctttcgctttggtccccaccacgccgacgggtcgagaatccctatcccgaacc 

a APRKRNQGWCGCPALRDRAW 

b RPESETRGGAAAQLLG I GLG 

c AQKAKPGVVRL PSS * G * GLA- 
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Figure 7-6

ctccttggccactgtgtggaggggtggggctcttgaggggcgtgagtgcgggcgccactg 

1561 + + + + + + 1620 

gaggaaccggtgacacacctccccaccccgagaactccccgcactcacgcccgcggtgac 

a LLGHCVEGWGS * G A * VRAPL 

b SLATVWRGGALEGRECGRHC

c PWPLCGGVG LLRGV S A G A T V -

tagtccgggagtgactgctccgcgtgctgcaccggcgttccgcattaaagctgcccgacc

1621 + + + + + + 1680

atcaggccctcactgacgaggcgcacgacgtggccgcaaggcgtaatttcgacgggctgg 

a *SGSDCSACCTGVPH*SCPT 

b S PGVTAP RAAPAFR I KAAR P 

c VRE * LLRVLHRRSALKLPDP- 

cttgttgggtggggaggggaagacgtgggaattgggcgttgcctccgactgcagtgagat

1681 + + + + + + 1740 

gaacaacccacccctccccttctgcacccttaacccgcaacggaggctgacgtcactcta

a LVGWGGEDVG IGRCLRLQ * D 

b LLGGEGKTWELGVASDCSEI- 

c CWVGRGRRGNWAL P PTAVRS- 

cagctctctactgacctgcgttgaccacgagttactttgcagcgactatcggatcgtcta 

1741 + + + + + + 1800 

gtcgagagatgactggacgcaactggtgctcaatgaaacgtcgctgatagcctagcagat 

a QLSTDLR* PRVTLQRLSDRL 

b SSLLTCVDHELLCSDYRIV* 

c ALY * PALTT S Y F A A T I G S S S - 

gttaataaatagtacgagtgaactaactctcaattaattctgaaggattactgtgaccag 

1801 + + + + + + 1860

caattatttatcatgctcacttgattgagagttaattaagacttcctaatgacactggtc 

a VNK*YE*TNSQLI LKDYCDQ 

b LINSTSELTLN* F * RITVTS 

c **IVRVN*LSINSEGLL*PA- 

catgctttatgactagttttaccaaccacctcccttcctttatttagtaggtagacagga 

1861 + + + + + + 1920 

gtacgaaatactgatcaaaatggttggtggagggaaggaaataaatcatccatctgtcct
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Figure 7-7 



a HAL*LVLPTTSLPLFSR*TG

b MLYD* FYQPPPFLYLVGRQE-

c CFMTSFTNHLPSFI * * V D R K -

aaatagtcaacattgttttaggtagttaactagtgatgttcatagtaaaccatttccttt 
1921 + + + + + + 1980 

tttatcagttgtaacaaaatccatcaattgatcactacaagtatcatttggtaaaggaaa 

a K* STLF*VVN**CS**TISF 

b NSQHCFR*LTSDVHSKPFPF- 
c IVNIVLGS* LVMFI VNHFLL-

taccttttctttttctttttttctttatgtgcaaaatcttctacaacatttctgtttaaa 
1981 + + + + + + 2040

atggaaaaaaaaaagaaaaaaagaaatacacattttagaagatgttgtaaagacaaattt 



a YLFFFFFSLCVKSSTTFLFK
b TFFFSFFLYV* NLLQHFCLN- 

C PFFFLFFFMCKI FYNI S V * T - 

catctccatcttctggggagtagaaaaaatacaattttaaaaagatctccattttaaaac
2041 + + + + + + 2100 

gtagaggcagaagacccctcatcttttttatgttaaaatttttctagaggtaaaattttg 

a HLHLLGSRKNTI LKRS P F * N 

b I S I FWGVEKIQF* KDLHFKT- 

c SPSSGE*KKYNFKKI S I LKH-

atctccgtcttccggggagtagaaatttttttcttcttctggggagtagaaaaaataatt 
2101 + + + + + + 2160

tagaggcagaagacccctcatctttaaaaaaagaagaagacccctcatcttttttattaa 

a ISVFWGVEIFFFFWGVE KII
b SPSSGE*KFFSSSGE*KK*F-
C LRLLGSRNFFLLLG S RKNNL- 

tagatacacaggaaatatttcatagaaaataatttttttcttttttttgtttacatctgg 

2161 + + + + + + 2220 

atctatgtatcctttataaagtatcttttattaaaaaaagaaaaaaaacaaatgtagacc 

a *IHRKYFIENNFFLFFVYIW

b RYIGNIS*KI IFFFFLFTSG- 

c D T * EI FHRK* FFSFFCLHLV- 



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Figure 7-8 

tattttcttctcataaagaaaggcattagtttcctggcatgtaacccagctaaagaagag 

2221 + + + + + + 2280 

ataaaagaagagtatttctttccgtaatcaaaggaccgtacattgggtcgatttcttctc 

a YFLLIKKGI SFLACNPAKEE 

b I FFS*RKALVSWHVTQLKKS 

c FSSHKERH* FPGM* PS * R R V - 

taatcagtgaatgagagacacagtttttctatcaacttagtctgtttttctatcaactta 

2281 + + + + + + 2340 

attagtcacttactctctgtgtcaaaaagatagttgaatcagacaaaaagatagttgaat 

a *SVNERHSFSINLVCFSINL 

b NQ*MRDTVFLST* SVFLST* 

c ISE * ETQFFYQLSLF FYQLS-

gtctgtttgcatgcattttatgatgatcattaaacagtattaagtaaagaaacagaagaa

2341 + + + + + + 2400 

cagacaaacgtacgtaaaatactactagtaatttgtcataattcatttctttgtcttctt 

a VCLHAFYDDH *TVLS KET EE 

b SVCMHFMMI IKQY*VKKQKN- 

c LFACIL* * S LNS I K* RNRRT- 

cagaattttcgtccatcttttttttcatctcaggcttcatgaacttgggtattttaggca 

2401 + + + + + + 2460

gtcttaaaagcaggtagaaaaaaaagtagagtccgaagtacttgaacccataaaatccgt 

a QNFRPSFFSSQAS*TWVF*A

b RIFVHLFFHLRLHELGYFRH-
c EFSSIFFF I SGFMNLGI LGM-

tgaaggtttttcaaaagatacaggaagttattctaggagagattttatcaaagtgtgcac 

2461 + + + + + + 2520 

acttccaaaaagttttctatgtccttcaataagatcctctctaaaatagtttcacacgtg 

a *RFFKRYRKLF* ERFYQSVH 

b EGFSKDTGSYSRRDFI KVCT- 

c KVFQKI QEV I LGE I LS KCAP- 

cttgattttaatcgaaactaggcctttgcaactacactacagtaaaataatagaagggat 

2521 + + + + + + 2580 

gaactaaaattagctttgatccggaaacgttgatgtgatgtcattttattatcttcccta 
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Figure 7-9 

a LDFNRN*AFATTLQ*NNRRD 

b LILIETRPLQLHYSKI IEGI 

c * F * SKLGLCNYTTVK* * KGF- 

ttatgctcggattttttttttgttttatttttgtcttcaaacaggacacgatcaacaaat
2581 + + + + + + 2640

aatacgagcccaaaaaaaaaacaaaataaaaacagaagtttgtcctgtgctagttgttta 

a LCSDFFFVLFLSSNRTRSTN 

b YARIFFLFYFCLQTGHDQQI 

C MLGFFFCFI FVFKQDT I N K L - 



tagaaaatgaactgaggaccacaaagagtgaaatggcacgatacctaaaagaataccaag

2641 + + + + + + 2700

atctttcactcaactcctggtgtttctcactttaccgtgctatggattttcttatggttc 



a *KMN*GPQRVKWHDT*KNTK

b R K * I EDHKE*NGTI PKR I PR 

c E N E LRTTKS EMARYLK E YQD- 

acctccccaacgtgaagatggctttggatattgagattgctgcttacaggtgaaaataga 
2701 + + + + + + 2760 

tggaggagttgcacttctaccgaaacctataactctaacgacgaatgtccacttttatct 

a TSST*RWLWILRLLLTGENR

b PPQRED GFGY* DCCLQVKIE- 

c LLNVKMALD I E IAAYR * K * R - 

ggggcaaagacagcagccattaaaccttaggaagaaaatcagatcccatttaaagttatg 
2761 + + + + + + 2820 

ccccgtttctgtcgtcggtaatttggaatccttcttttagtctagggtaaatttcaatac

a GA KTAA I KP * E ENQ I P F KVM

b GQRQQPLNLRKKI RSHLKLC 

c GKDSSH*TLGRKSDPI * SYV- 

ttggatcagaaaccttcaataatagtccttttgaaataatgaagtgttagtttttggctt 
2821 + + + + + + 2880

aacctagtctttggaagttattatcaggaaaactttattacttcacaatcaaaaaccgaa

a LDQKPSIIVLLK**SVSFWL

b WIRNLQ**SF*NNEVLVFGF-

c GSETFNNSPFEIMKC* FLAS- 




Figure 7-10 

cttccaagaagagggtatttagatatataagaatttaaccctgtaattagagtcctgttt 

2881 + + + + + + 2940 

gaaggttcttctcccataaatctatatattcttaaattgggacattaatctcaggacaaa 

a LPRRGYLDI*EFNPVIRVLF

b FQEEGI* IYKNLTL*LESCF- 

c SKKRVFRYIRI*PCN*SPVF- 

ttatcttgtcattacactttaaatctaataggagtgatttatttatattttttctggtct 

2941 + + + + + + 3000 

aatagaacagtaatgtgaaatttagattatcctcactaaataaatataaaaaagaccaga 

a LSCHYTLNL IGVI YLYFFWS 

b YLVITL*I**E*FIYIFSGL- 

c ILSLHFKSNRSDLFI FFLVS-

ccatcaaaagatccccaggcattaagtattgataaatcccagccctgctcctgcttgcct 

3001 + + + + + + 3060

ggtagttttctaggggtccgtaattcataactatttagggtcgggacgaggacgaacgga

a PSKDPQALSIDKSQPCSCLP 

b HQKIPRH*VLINPSPAPACL- 

c IKRSPGI K Y * * I PALL L LA F- 

ttgtgtttagggtactcagagcaagttgtgaaacacaggtgttttttaacctcaccttgc

3061 + + + + + + 3120

aacacaaatcccatgagtctcgttcaacactttgtgtccacaaaaaattggagtggaacg

a LCLGYSEQVVKHRCFLTS PC 

b CV*GTQSKL*NTGVF* P H L A - 

c V FRVLRASCETQVFFN LTLH- 

acctgcatccccaggaaactcttggaaggcgaggagacccgactcagtttcaccagcgtg 

3121 + + + + + + 3180 

tggacgtaggggtcctttgagaaccttccgctcctctgggctgagtcaaagtggtcgcac 

a TCI PRKLLEGEETRLSFTSV 

b PASPGNSWKARRPDSVSPAW- 

C LHPQETLGRRGDPTQFHQRG- 

ggaagcataaccagtggctactcccagagctcccaggtctttggccgatctgcctacggc 

3181 + + + + + + 3240 

ccttcgtattggtcaccgatgagggtctcgagggtccagaaaccggctagacggatgccg 
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Figure 7-11 

a GSITSGYSQSSQVFGRSAYG 

b EA* PVATPRAPRSLADLPTA- 

C KHNQWLLPELPGLWPICLRR- 

ggtttacagaccagctcctatctgatgtccacccgctccttcccgtcctactacaccagc 
3241 + + + + + + 3300 

ccaaatgtctggtcgaggatagactacaggtgggcgaggaagggcaggatgatgtggtcg 

a GLQT SSYLMSTRSFPSYYTS 

b VYRPAPI*CPPAPSRPTTPA- 

C FTDQLLSDVHPLLPVLLHQP- 

catgtccaagaggagcagaccgaagtggaggaaaccattgaggcgtctaaggctgaggaa 
3301 + + + + + + 3360 

gtacaggttctcctcgtctggcttcacctcctttggtaactccgcagattccgactcctt 

a HVQEEQTEVEETIEASKAEE 

b MSKRSRPKWRKPLRRLRLRK- 

C CPRGADRSGGNH * G V * G * G S - 

gccaaggatgagcccccctctgaaggagaagccgaggaggaggagaaggacaaggaagag 

3361 + + + + + + 3420 

cggttcctactcggggggagacttcctcttcggctcctcctcctcttcctgttccttctc

a AKDEPPSEGEAEEEEKDKEE 

b PRMSPPLKEKPRRRRRTRKR- 

c QG*AP L* RRSRGGGEGQGRG- 

gccgaggaagaggaggcagctgaagaggaagaaggtatgataagaaaaaacccctgcaac 
3421 + + + + + + 3480 

cggctccttctcctccgtcgacttctccttcttccatactattcttttttggggacgttg 

a AEEEEAAEEEEGMI RKNPCN 

b PRKRRQLKRKKV* *EKTPAT- 

C RGRGGS * RGRRYDKKKPLQL- 

ttcaagtgtaaactgggtgtggagatttgttaggaggtggataagacaaatgaagccttg

3481 + + + + + + 3540 

aagttcacatttgacccacacctctaaacaatcctccacctattctgtttacttcggaac 

a FKCKLGVEIC * EVDKTNEAL 

b SSVNWVWRFVRRWIRQMKPC - 

c Q V * TGCGDLLGGG * D K * SLA- 
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Figure 7-12 

ctcatttattcatatatgacattagaatcataaataaattttctgtttgtttagcaaaac 

3541 + + + + + + 3600 

gagtaaataagtatatactgtaatcttagtatttatttaaaagacaaacaaatcgttttg 

a LIYSYMTLES* INFLFV*QN 

b SFIHI*H*NHK* IFCLFSKT- 

c HLFIYDIRI INKFSVCL AKL- 

tttcctaaggcatctactctgaatgaggtgattggtcaaaattttcattttttaatataa 
3601 + + + + + + 3660 

aaaggattccgtagatgagacttaccccactaaccagttttaaaagtaaaaaattatatt 

a FPKASTLNEVIGQNFHFLI* 

b FLRHLL*MR * L V K I FI F * YN

c S*GIYSE*GDWSKFSFFNII- 

tcatttaacacagcaggttggtgtcctaaagaacaaaaatagataccagacacataatga 

3661 + + + + + + 3720 

agtaaattgtgtcgtccaaccacaggatttcttgtttttatctatggtctgtgtattact 

a SFNTAGWCPKEQK* I PDT* * 

b HLTQQVGVLKNKNRYQTHNE- 

c I * H S R L V S * RTKIDTRH I M K - 

aagaaatattgaggttaagtcttggagaggagcagagcttcccatacctagaagtgatct 

3721 + + + + + + 3780

ttctttataactccaattcagaacctctcctcgtctcgaagggtatggatcttcactaga 

a KKY*G*VLERSRASHT*K* S

b RNIEVKSWRGAELPI PRSDL

c EILRLSLGEEQSFPYLEVI S-

cattcgatttaaatatgtgttcagtggcaaattattcatggcaagctttgtctgttacat 

3781 + + + + + + 3840

gtaagctaaatttatacacaagtcaccgtttaataagtaccgttcgaaacagacaatgta 

a HSI*ICVQWQIIHGKLCLLH 

b I RFKYVFSGKLFMASFVCYM 

C FDL NMC SVANY SWQALSVTC- 

gtgcttttggagagagtggagctgggaggttttggtagcattctgacagttgtgtttgca 
3841 + + + + + + 3900

cacgaaaacctctctcacctcgaccctccaaaaccatcgtaagactgtcaacacaaacgt 
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Figure 7-13 

a VLLERVELGGFGS I LTVVFA 

b CFWREWSWEVLVAF* QLCLQ- 

C AFGESGAGRFW * HSDS CVCK- 

aataaaacctttgcagacatgttttgactggacttaccctggatttgcattttgtacatt 

3901 + + + + + + 3960 

ttattttggaaacgtctgtacaaaactgacctgaatgggacctaaacgtaaaacatgtaa 

a NKTFADMF* LDLPWI CI LYI 

b I KPLQTCFDWTYPGFAFCTF- 

c *NLCRHVLTGLTLDLHFVHF-

ttctttttatgttaaagctgccaaggaagagtctgaagaagcaaaagaagaagaagaagg 

3961 + + + + + + 4020 

aagaaaaatacaatttcgacggttccttctcagacttcttcgttttcttcttcttcttcc 

a FFLC * SCQGRV* RSKRRRRR 

b SFYVKAAKEESEEAKEEEEG- 

c LFMLKLPRKSLKKQKKKKKE- 

aggtgaaggtgaagaaggagaggaaaccaaagaagctgaagaggaggagaagaaagttga 

4021 + + + + + + 4080 

tccacttccacttcttcctctcctttggtttcttcgacttctcctcctcttctttcaact

a R*R*RRRGNQRS *RGGEES* 

b GEGEEGEETKEAEEE EKKVE 

C VKVKKERKP KKLKR RRRKL K - 

aggtgctggggaggaacaagcagctaagaagaaagactgaacccccatttccttaattat 

4081 + + + + + + 4140 

tccacgacccctccttgttcgtcgattcttctttctaacttgggggtaaaggaattaata 

a RCWGGTSS* EERLNPHF LNY

b GAGEEQAAKKKD*TPISLI I 

c VLGRNKQLRRKIEP PFP * L F - 

ttcaggaataattctcccgaaatcaggtcaaccccatcaccaaccaaccaaccagttgag 

4141 + + + + + + 4200 

aagtccttattaagagggctttagtccagttggggtagtggttggttggttggtcaactc 

a FRNNSPEIRSTPSPTNQPVE 

b SGI ILPKSGQPHHQPTNQLS 

c Q E * FSRNQVNP ITNQPTS * V- 
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Figure 7-14 

ttccagattctatgtgaattaaaaagtcaatatatgtataattctgagatgacttaggtt 



4201 + + + + + + 4260 

aaggtctaagatacacttaatttttcagttatatacatattaagactctactgaatccaa 

a FQILCELKSQYMYNSEMT*V 

b SRFYVN*KVNICIILR*LRL- 

c PDSM*IKKSIYV*F*DD LGW-

ggacattcaatgttgtgctatgaatttcctctttatgcagagtatctgtttgcttgcaga 

4261 + + + + + + 4320 

cctgtaagttacaacacgatacttaaaggagaaatacgtctcatagacaaacgaacgtct 

a GHSMLCYEFPLYAEYLFACR 

b DIQCCAMNFLFMQSICLLAE- 

c TFNVVL*ISSLCRVSVCLQS- 

gtggctttcggcttgctgccagcctgtgcatggtccacgcttatgagttcaggatctacg 

4321 + + + + + + 4380 

caccgaaagccgaacgacggtcggacacgtaccaggtgcgaatactcaagtcctagatgc 

a VAFGLLPACAWSTLMS SGST 

b WLSACCQPVHGPRL*VQDLR-

c G FRLAAS LCMVHAY EFR I Y G - 

gcaatgtgaatcattcagatgtttacaataaaaaacaccacatgagtaaatgaattcact 

4381 + + + + + + 4440 

cgttacacttagtaagtctacaaatgttattttttgtggtgtactcatttacttaagtga 

a AM* I IQMFTI KNTT*VNEFT 

b QCESFRCLQ* KTPHE*MNSL-

c NVNHSDVYNKKHHMSK* I H * -

aatgttaatgttaaacttcatggaaaagtagtcctttgaaccttcggtggttagcaatta 

4441 + + + + + + 4500

ttacaattacaatttgaagtaccttttcatcaggaaacttggaagccaccaatcgttaat 

a NVNVKLHGKVVL*TFGG*QL

b MLMLNFMEK*SFEPSVVSN* 

c C*C*TSWKSSPLNLRWLAIK- 

aagaccctgagttatgtgcaataaatagtaaataaagttataccgaatgatgtatttttt 

4501 + + + + + + 4560 

ttctgggactcaatacacgttatttatcatttatttcaatatggcttactacataaaaaa 
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Figure 7-15 

a KTLSYVQ* IVNKVI PNDVFF 

b RP*VMCNK* * IKLYRMMYFL- 

c DPELCAINSK*SYTE*CIFC- 

gccgtggttgttacctaattaaaataccttaaagatggcaccaatataaagtgtgtgcca 

4561 + + + + + + 4620 

cggcaccaacaatggattaattttatggaatttctaccgtggttatatttcacacacggt 

a AVVVT*LKYLKDGTNI KCVP

b PWLLPN*NTLKMAPI*SVCQ- 

C RGCYLI KI P * RWHQYKVCAS- 

gtgaactattgacctccaattttttaaaaagccgaaattttaacaattaccaatactttt 

4621 + + + + + + 4680 

cacttgataactggaggttaaaaaatttttcggctttaaaattgttaatggttatgaaaa

a VNY*PPIF*KAEILTI TNTF

b *TIDLQFFKKPKF*QLPILF-

C ELLTSNFLKSRNFNNYQYFF- 

tt 

4681 4682 
aa 

a 
b 
c 

Enzymes that do cut : 
NONE 

Enzymes that do not cut : 
Not I 






Figure 8-1 Neurofilament M 

(Linear) MAP of: hsnfm.gcg check: 3606 from: 1 to: 6236 

ID HSNFM standard; DNA; HUM; 6236 BP.
XX 

AC Y00067; 
XX 

NI g35045 

XX . . . 

With 1 enzymes: NOTI 

October 31, 1996 14:29 

cagctgctttaagacaaggggtgggggaaggggagggaggcaagaaaagatgagggtggg
1 + + + + + + 60

gtcgacgaaattctgttccccacccccttcccctccctccgttcttttctactcccaccc

a QLL*DKGW GKGREARKDEGG 

b SCFKTRGGGRGGRQEKMRVG 

c AALRQGVGEGEGGKKR * GWG- 

ggaggggaaaagagggaatgcaaggggaaggagggaggagacggggagaaggaaagattg 
61 + + + + + + 120

cctccccttttctcccttacgttccccttcctccctcctctgcccctcttcctttctaac 

a GGEKRECKGKEGGDGEKERL 

b EGKRGNARGRREETGRRKDW 

c RGKEGMQGEGGRRRGEGKIG- 

gaagaaaaggatctccgaggaaggggctgagagaagggcagggtgaactggactaaaggc 

121 + + + + + + 180 

cttcttttcctagaggctccttccccgactctcttcccgtcccacttgacctgatttccg 

a EEKDLRGRG * EKGRVNWTKG 

b KKRI SEEGAERRAG* TGLKA 

c RKGSPRKGLREGQGELD* RP- 

cagagtaggaaggagaagaggggccaaaaaagaaggggatgaaattaagcacagaagatg 

181 + + + + + + 240

gtctcatccttcctcttctccccggttttttcttcccctactttaattcgtgtcttctac 

a Q S R KE KRGQ KRRG * N * AQ KM 

b RVGRRRGAKKEGDE I KHRRW 

c E*EGEEGPKKKGMKLSTEDG- 
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Figure 8-2 

ggtaaagaaaaaagtatcagggaaagggcaaaataagagaaagccttgaggataagaggg 

241 + + + + + + 300 

ccatttcttttttcatagtccctttcccgttttattctctttcggaactcctattctccc 

a GKEKSIRERAK* EK ALRIRG 

b VKKKVSGKGQNKRKP* G * E G - 

C * RKKYQGKGKI RE S LED KRV- 

tagaaggctaaagaacaaggggaccacggggtcggggaagcgctgcctgaacggcgggac 

301 + + + + + + 360 

atcttccgatttcttgttcccctggtgccccagccccttcgcgacggacttgccgccctg 

a * KAKEQGDHGVGEALPERRD 

b RRLKNKGTTGSGKRCLNGGT 

c EG * RTRGPRGRGSAA* TAGQ- 

agtgacaaaagaaagggcgctggcgatattccgaccaagggaaacgcaatcgggaggtga 

361 + + + + + + 420

tcactgttttctttcccgcgaccgctataaggctggttccctttgcgttagccctccact 

a SDKRKGAGDI PTKGNAI G R * 

b VTKERALAIFRPRETQSGGE- 

C * QKKGRWRYSDQGKRNREVR- 

gaaatcgggaggtgagaaatggaaagaaggcgaatccgcggctacaagtagcctgggact 

421 + + + + + + 480

ctttagccctccactctttacctttcttccgcttaggcgccgatgttcatcggaccctga 

a EIGR*EMERRRIRGYK *PGT 

b KSGGEKWKEGESAATSS LGL- 

c NREVRNGKKANPRLQVAWD*- 

gaaaggggacctgggggaggggctgggcccagggcagaaaagtccaggttcccatgcggc 
481 + + + + + + 540 

ctttcccctggaccccctccccgacccgggtcccgtcttttcaggtccaagggtacgccg 

a ERGPGGGAGPRAEKSRFPCG 

b KGDLGEGLGPGQKS PGSHAA 

C KGTWGRGWAQGRKVQVPMRP- 

ctgggcccacgtggagcgggcgctgaatcaccgttcagccgcccccctcccctcctcccc 

541 + + + + + + 600 

gacccgggtgcacctcgcccgcgacttagtggcaagtcggcggggggaggggaggagggg 
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Figure 8-3 

a LGPRGAGAES PFSRP PPLLP 

b WAHVERALNHRSAAPLPSSP- 

C GPTWSGR*ITVQPPPSPPPR- 

gaccggtgcccgcagtccccgcctcctcggccgccgcctccacggggcgggccctggccc 
601 + + + + + + 660 

ctggccacgggcgtcaggggcggaggagccggcggcggaggtgccccgcccgggaccggg 

a DRCPQSPPPRPPPPRGGPWP 

b TGARSPRLLGRRLHGAGPGP

c PVPAVPAS S AAAS TGRALAR-



gggaccagcgccgcggctataaatgggctgcggcgaggccggcagaacgctgtgacagcc 

661 + + + + + + 720

ccctggtcgcggcgccgatatttacccgacgccgctccggccgtcttgcgacactgtcgg 



a GTSAAAINGLRRGRQNAVTA 

b GPAPRL*MGCGEAGRTL*QP 

c DQRRGYKWAAARPAERCDS H - 

acacgccccaaggcctccaagatgagctacacgttggactcgctgggcaacccgtccgcc 
721 + + + + + + 780 

tgtgcggggttccggaggttctactcgatgtgcaacctgagcgacccgttgggcaggcgg 

a TRPKASKMSYTLDSLGNPSA 

b HAPRPPR*ATRWTRWATRPP 

c TPQGLQDELHVGLAGQPVRL- 

taccggcgggtaaccgagacccgctcgagcttcagccgcgtcagcggctccccgtccagt 
781 + + + + + + 840 

atggccgcccattggctctgggcgagctcgaagtcggcgcagtcgccgaggggcaggtca

a YRRVTETRSSFSRVSGSPSS 

b TGG* PRPARASAASAAPRPV- 

c PAGNRDPLELQ PRQRLPVQW- 

ggcttccgctcgcagtcgtggtcccgcggctcgcccagcaccgtgtcctcctcctataag 
841 + + + + + + 900 

ccgaaggcgagcgtcagcaccagggcgccgagcgggtcgtggcacaggaggaggatattc 

a GFRSQSWSRGSPSTVSSSYK 

b ASARSRGPAARPAPCPPPIS- 

C LPLAVVVPRLAQHRVLLL* A- 
Figure 8-4



cgcagcatgctcgccccgcgcctcgcttacagctcggccatgctcagctccgccgagagc 

901 + + + + + + 960 

gcgtcgtacgagcggggcgcggagcgaatgtcgagccggtacgagtcgaggcggctctcg 

a RSMLAPRLAYSSAML SSAES 

b AACSPRASLTARPCSAPPRA 

c Q HARPAPRLQLGHAQLR REQ- 

agccttgacttcagccagtcctcgtccctgctcaacggcggctccggacccggcggcgac 

961 + + + + + + 1020 

tcggaactgaagtcggtcaggagcagggacgagttgccgccgaggcctgggccgccgctg 

a SLDFSQSSSLLNGGSGPGGD 

b ALTSASPRPCSTAAPDPAAT 

c P * LQPVLVPAQRRLRTRRRL- 

tacaagctgtcccgctccaacgagaaggagcagctgcaggggctgaacgaccgctttgcc 

1021 + + + + + + 1080 

atgttcgacagggcgaggttgctcttcctcgtcgacgtccccgacttgctggcgaaacgg 

a YKLS RSNEKEQLQG LNDR F A 

b TSCPAPTRRSSCRG * TTALP 

c QAVPLQREGAAAGAERPLCR-

ggctacatagagaaggtgcactacctggagcagcagaataaggagattgaggcggagatc 

1081 + + + + + + 1140 

ccgatgtatctcttccacgtgatggacctcgtcgtcttattcctctaactccgcctctag 

a G Y I EKVHYLEQQNKE I EAE I 

b AT*RRCTTWSSRIRRLRRRS 

c LHREGALPGAAE * GD*GGDP- 

caggcgctgcggcagaagcaggcctcgcacgcccagctgggcgacgcgtacgaccaggag 

1141 + + + + + + 1200 

gtccgcgacgccgtcttcgtccggagcgtgcgggtcgacccgctgcgcatgctggtcctc 

a QALRQKQASHAQLGDAYDQE 

b RRCGRSR PRTPSWATRTTRR 

c GAAAEAGLARPAGRRVR PGD- 

atccgcgagctgcgcgccaccctggagatggtgaaccacgagaaggctcaggtgcagctg 

1201 + + + + + + 1260 

taggcgctcgacgcgcggtgggacctctaccacttggtgctcttccgagtccacgtcgac 
Figure 8-5

a IRELRATLEMVNHEKAQVQL 

b SASCAPPWRW*TTRRLRCSW- 

c PRAARHPGDGEPREGSGAAG- 

gactcggaccacctggaggaagacatccaccggctcaaggagcgctttgaggaggaggcg 

1261 + + + + + + 1320 

ctgagcctggtggacctccttctgtaggtggccgagttcctcgcgaaactcctcctccgc 

a DSDHLEEDI HRLKERFEEEA 

b TRTTWRKTSTGSRSALRRRR 

c LGPPGGRHPPAQGAL * G G G A - 

cggttgcgggacgacactgaggcggccatccgggcgctgcgcaaagacatcgaggaggcg 

1321 + + + + + + 1380 

gccaacgccctgctgtgactccgccggtaggcccgcgacgcgtttctgtagctcctccgc 

a RLRDDTEAAI RALRKD I EEA 

b GCGTTLRRPSGRCAKTSRRR- 

c VAGRH * GGHPGAAQRHRGGV- 

tcgctggtcaaggtggagctggacaagaaggtgcagtcgctgcaggatgaggtggccttc 

1381 + + + + + + 1440 

agcgaccagttccacctcgacctgttcttccacgtcagcgacgtcctactccaccggaag 

a SLVKVELDKKVQS LQDEVAF 

b RWSRWSWTRRCSRCRMRWPS- 

c AGQGGAGQEGAVAAG * GGL P - 

ctgcggagcaaccacgaggaggaggtggccgaccttctggcccagatccaggcatcgcac 

1441 + + + + + + 1500

gacgcctcgttggtgctcctcctccaccggctggaagaccgggtctaggtccgtagcgtg 

a LRSNHEEEVADLLAQ I QASH 

b CGATTRRRWPTFWPRSRHRT 

c AEQPRGGGGRPSGPDPG I A H - 

atcacggtggagcgcaaagactacctgaagacagacatctcgacggcgctgaaggaaatc 

1501 + + + + + + 1560 

tagtgccacctcgcgtttctgatggacttctgtctgtagagctgccgcgacttcctttag 

a ITVERKDYLKTD I STALKE I 

b SRWSAKTT*RQTSRRR*RKS- 

c HGGAQRLPEDRHLDGAEGNP- 



Figure 8-6

cgctcccagctcgaaagccactcagaccagaatatgcaccaggccgaagagtggttcaaa 

1561 + + + + + + 1620 

gcgagggtcgagctttcggtgagtctggtcttatacgtggtccggcttctcaccaagttt 

a RSQLESHSDQNMHQAEEWFK 

b APSSKATQTRICTRPKSGSN- 

c LPARKPLRPEYAPGRRV VQM- 

tgccgctacgccaagctcaccgaggcggccgagcagaacaaggaggccatccgctccgcc 

1621 + + + + + + 1680 

acggcgatgcggttcgagtggctccgccggctcgtcttgttcctccggtaggcgaggcgg 

a CRYAKLTEAAEQNKEA I RSA 

b AATPSSPRRPSRTRRPSAPP- 

C PLRQAHRGGRAEQGGHPLRQ- 

aaggaagagatcgccgagtaccggcgccagctgcagtccaagagcatcgagctagagtcg 

1681 + + + + + + 1740 

ttccttctctagcggctcatggccgcggtcgacgtcaggttctcgtagctcgatctcagc

a KEEIAEYRRQLQSKS I ELES 

b RKRSPSTGASCSPRASS*SR-

c GRDRRV PAPAAVQ EHRARVG- 

gtgcgcggcaccaaggagtccctggagcggcagctcagcgacatcgaggagcgccacaac 

1741 + + + + + + 1800 

cacgcgccgtggttcctcagggacctcgccgtcgagtcgctgtagctcctcgcggtgttg 

a VRGTKESLERQLSDI EERHN 

b CAAPRSPWSGSSATSRSATT- 

c ARHQGVPGAAAQRHRGAPQP- 

cacgacctcagcagctaccaggtaggaaccgcggcctcggccagcctcggccacggccac 

1801 + + + + + + 1860 

gtgctggagtcgtcgatggtccatccttggcgccggagccggtcggagccggtgccggtg 

a HDLS SYQVGTAASASLGHGH 

b TTSAATR*EPRPRPASATAT 

C RPQQLPGRNRGLGQPRPRPR- 

gccgcgcgcccccgacacttgggctcgtgcccaggcgccctctccgccgcgctccctggt 

1861 + + + + + + 1920 

cggcgcgcgggggctgtgaacccgagcacgggtccgcgggagaggcggcgcgagggacca 
Figure 8-7

a AARPRHLGS C PGALS A A L P G 

b PRAPDTWARAQAPSPPRSLV

c RAPPTLGLVPRRPLRRAPWW- 

ggccgctcgctagagcacgcgcgccgcagacctagggtatttgcggatcagcgtcctcgc 
1921 + + + + + + 1980

ccggcgagcgatctcgtgcgcgcggcgtctggatcccataaacgcctagtcgcaggagcg 

a GRSLEHARRRPRVFADQRPR

b AAR*STRAADLGYLRISVLA

c PLARARAPQT*GICGSASSP-

ccatctcatcctccacactccgcccccacccacctgccccagctgctaagggtcttgacc 
1981 + + + + + + 2040

ggtagagtaggaggtgtgaggcgggggtgggtggacggggtcgacgattcccagaactgg 

a PSHPPH SAPTHLPQLLRVLT 

b HLILHTPPPPTCPSC*GS*P- 

c I S S STLR PH P PAPAAKG L DL- 

tttttcagaaacgtgcatcttttcccagttctaattttgcacgcttgcacgtttaaagca 
2041 + + + + + + 2100 

aaaaagtctttgcacgtagaaaagggtcaagattaaaacgtgcgaacgtgcaaatttcgt 

a FFRNVHLFPVLI LHACTF KA 

b FSETCIFSQF* FCTLARLKQ 

c FQKRASFPSSNFARLHV*SR-

ggagggatgaattcggtagtggataaatcagcaactttaggatagcttatgcagaaacgc 

2101 + + + + + + 2160 

cctccctacttaagccatcacctatttagtcgttgaaatcctatcgaatacgtctttgcg 

a GGMNSVVDKSATLG* LMQKR 

b EG* IR*WIMQQL*DSLCRNA 

c RDEFGSG* I SNFRIAYAETR- 

gtgtattctctacttttccggcagtgatcggaagagctctcaaaattggcttcagccaaa 
2161 + + + + + + 2220 

cacataagagatgaaaaggccgtcactagccttctcgagagttttaaccgaagtcggttt 

a VYSLLFRQ* SEELSKLASAK 

b CILYFSGSDRKSSQNWLQPK 
c VFSTFPAVI GRALKI GFSQR- 
Figure 8-8

gggctcagatgggaatggccaggtcagccatggagtttccccatgcatgtttgtgtcctg 

2221 + + + + + + 2280 

cccgagtctacccttaccggtccagtcggtacctcaaaggggtacgtacaaacacaggac 

a GLRWEWPGQPWSFPMHVCVL 

b GSDGNGQVSHGVSPCMFVSC- 

c A QMGMARSAME F PHACL CPV- 

ttgagacgtgttctaagtccactggtctccgtgcgtgatgtgcccaggaagtgtcctatt 

2281 + + + + + + 2340 

aactctgcacaagattcaggtgaccagaggcacgcactacacgggtccttcacaggataa 

a LRRVLSPLVSVRDVPRKCPI 

b *DVF*VHWSPCVMCPGSVLL- 

c ETCSKSTGLRA* CAQEVSYC- 

gtcttactgatcttgtatcttcatttgagaatcgcttagatttaaaagaaaaagggggtg 

2341 + + + + + + 2400 

cagaatgactagaacatagaagtaaactcttagcgaatctaaattttctttttcccccac 

a VLLILYLHLRIA*I*KKKGV

b SY* SCIFI *ESLRFKRKRGW- 

c LTDLVS S FENRLDL KEKGGG- 

ggacggggggctgggagtcaggtgtcagcgaggtttgcagaagtggagggagacgggagg 

2401 + + + + + + 2460 

cctgccccccgaccctcagtccacagtcgctccaaacgtcttcacctccctctgccctcc 

a GRGAGSQVSARFAEVEGDGR 

b DGGLGVRCQRGLQKWRETGG 

c TGGWESGVS EVCRS GGRREE- 

aggccagggggaaggggtagcaagtggtttgcgaaggaagttgctgtttgcaaggatgag 

2461 + + + + + + 2520 

tccggtcccccttccccatcgttcaccaaacgcttccttcaacgacaaacgttcctactc

a RPGGRGSKWFAKEVAVCKDE 

b GQGEGVASGLRRKLLFARMS- 

c ARGKG*QVVCEGSCCLQG*V- 

tctggggagattctctgtgtctgtttcaggacaccatccagcagctggaaaatgagcttc 

2521 + + + + + + 2580 

agacccctctaagagacacagacaaagtcctgtggtaggtcgtcgaccttttactcgaag 
Figure 8-9

a SGEILCVCFRTPSSSWKMSF 

b LGRFSVSVSGHHPAAGK*AS

c WGDSLCLFQDTI QQLENELR- 

ggggcacaaagtgggaaatggctcgtcatttgcgcgaataccaggacctcctcaacgtca 

2581 + + + + + + 2640 

ccccgtgtttcaccctttaccgagcagtaaacgcgcttatggtcctggaggagttgcagt 

a GAQSGKW LVI CANTRTSSTS 

b GHKVGNGSSFARI PGPPQRQ 

c GTKWEMARH LREYQ DLLNVK- 

agatggctctggatatagaaatcgctgcgtacaggtacgatgcttactacgtgcgtggcc 

2641 + + + + + + 2700

tctaccgagacctatatccttagcgacgcatgtccatgctacgaatgatgcacgcaccgg 

a RWLWI *KSLRTGTMLTTCVA 

b DGSGYRNRCVQVRCLLRAWP- 

c MALD I E I AAYRY DA Y YVRG R - 

ggaacactaaccgcagtgcagaggctgttccggcagagcttccaccacttaagttaaagc 

2701 + + + + + + 2760 

ccttgtgattggcgtcacgtctccgacaaggccgtctcgaaggtggtgaattcaatttcg 

a GTLTAVQRLFRQSFHHLS * S 

b EH*PQCRGCSGRASTT*VKA- 

c NTNRSAEAVPAELP PLKLKQ- 

aggcagggtgcaggcatcaacccagcacctggttatcttgcttacttaaaaagaaattat 

2761 + + + + + + 2820 

tccgtcccacgtccgtagttgagtcgtggaccaatagaacgaatgaatttttctttaata 

a RQGAGINSAPGYLAYLKRNY 

b GRVQASTQHLVILLT*KEI I 

c AGCRHQLSTWLSCLLKKKLF- 

tctaaagaattgcaagtgtagttttatctctttttatgcagctttaaaagaatgaatact 

2821 + + + + + + 2880 

agatttcttaacgttcacatcaaaatagagaaaaatacgtcgaaattttcttacttatga 

a SKELQV* FYLFLCSFKRMNT 

b LKNCKCSFI SFYAALKE* I L 

C * RIASVVLSLFMQL* KNEY * - 
Figure 8-10

agtagaaacaaaaggtttttgaattacacaaaggaggtgcagattaatctcaatgcacat 

2881 + + + + + + 2940 

tcatctttgttttccaaaaacttaatgtgtttcctccacgtctaattagagttacgtgta 

a SRNKRFLNYTKEVQINLNAH 

b VETKGF* ITQRRCRLISMHM- 

c * KQKVFELHKGGAD* SQ CTC- 

gcttaaactttttatggaaaaatgttttcaaatgctggaagcatgaacagagttttggtt 

2941 + + + + + + 3000 

cgaatttgaaaaatacctttttacaaaagtttacgaccttcgtacttgtctcaaaaccaa 

a A*TFYGKMFSNAGSMNRVLV 

b LKLFMEKCFQMLEA*TEFWF

c LNFLWKNVF KCW KHEQS FG F - 

tctaatattccatctagtggtttcagcttttcaaatgtataatgtcaaggacaaacacca 

3001 + + + + + + 3060

agattataaagtagatcaccaaagtcgaaaagtttacatattacagttcctgtttgtggt 

a SNISSSGFSFSNV*CQGQTP 

b LI FHLVVSAFQMYNVKDKHQ 

c * Y F I *WFQLFKC IMSRTNTR- 

ggacgttctatttctctgtttctctgttatatagcttactattgccatcatctggctgag

3061 + + + + + + 3120

cctgcaagataaagagacaaagagacaatatatcgaatgataacggtagtagaccgactc

a GRSISLFLCYIAYYCHHLAE

b DVLFLCFSVI*LTIAIIWLR-

c TFYFSVSLLYSLLLPSSG* E - 

aatagatatagaatgatagaatatagatatagttcttttatatatgtagataatttatat 

3121 + + + + + + 3180 

ttatctatatcttactatcttatatctatatcaagaaaatatatacatctattaaatata 

a NRYRM IEYRYSS F IYVDNLY 

b IDIE**NIDIVLLYM*IIYM- 

c *I*NDRI*I*FFYICR*FIC- 

gtattatattttatgctagactgtagtataaattatatcaatatatcatgtatgatatta 

3181 + + + + + + 3240 

cataatataaaatacgatctgacatcatatttaatatagttatatagtacatactataat 
Figure 8-11

a VLYFMLDCS I N Y I N I SCMI L 

b YYILC*TVV*IISIYHV*Y*- 

c I IFYARL*YKLYQYIMYDIN- 

atctagatctatagatacacatatgtgcatatgcatataaatctagatatatagacacaa 

3241 + + + + + + 3300 

tagatctagatatctatgtgtatacacgtatacgtatatttagatctatatatctgtgtt 

a I*IYRYTYVHMHINLDI*TQ

b SRSIDTHMCICI*I*IYRHK-

c LDL*IHICAYAYKSRYIDTN-

atatatatgatcgttttatagatagtgagataggttataggtctattaactgaagtgacc

3301 + + + + + + 3360

tatatatactagcaaaatatctatcactctatccaatatccagataattgacttcactgg

a IYMIVL*IVR*VIGLLTEVT

b YI*SFYR**DRL*VY*LK*P-

c IYDRFIDSEIGYRSIN*SDL-

ttgctgttgagtaagcgcaaaggacaaaatcgttgattaaaatttttctgctaccaataa 

3361 + + + + + + 3420 

aacgacaacccattcgcgtttcctgttttagcaactaattttaaaaagacgatggttatt 

a LLLSKRKGQNR* LKFFCYQ * 

b CC*VSAKDKIVD*NFSATNK 
c AVE *AQRTKSLIKI FLLP I R - 

ggtagttataatataacgagaataaattgcatttacagagctatctctcttttcaggaaa

3421 + + + + + + 3480 

ccatcaatattatattgctcttatttaacgtaaatgtctcgatagagagaaaagtccttt 

a GSYNITRINCIYRAI SLFRK 

b VVII*RE* IAFTELSLFSGK- 

c * L * YNEN KLHLQSYLS FQES- 

gctgaataactacattaaatagacactttatgataaaaattatcaacaaatttataactc 

3481 + + + + + + 3540 

cgacttattgatgtaatttatctgtgaaatactatttttaatagttgtttaaatattgag 

a A E * L H * IDTL* *KLSTNL* L 

b LNNYI K*TLYDKNYQQI YNS 

c * ITTLNRHFMIKI INKF I TR- 
Figure 8-12

gatacacctgaaaatctaaacgtttaagaaagtgactactctcagaaaggctgtttggct 
3541 + + + + + + 3600

ctatgtggacttttagatttgcaaattctttcactgatgagagtctttccgacaaaccga 

a D TPENLNV* ESDYSQ KGCLA 

b IHLKI *TFKKVTTLRKAVWL- 

c YT*KSKRLRK*LLSERLFGF-



ttggagtttgggggcgttttgtctatggcttcttgtttttttgtttttgttcttgttttt 

3601 + + + + + + 3660

aacctcaaacccccgcaaaacaaataccgaagaacaaaaaaacaaaaacaaaaacaaaaa 



a LEFGGVLFMASCF FV FVFVF 

b WSLGAFCLWLLVFLFLFLFF- 

c GVWGRFVYGFLFFC FCFCFL- 

tgctatttggccactaacaagtttttcagcatattcatgttgtacctaatggatctctac 
3661 + + + + + + 3720 

acgataaaccggtgattgttcaaaaagtcgtataagtacaacatggattacctagagatg 

a CYLATNKFFS I FMLYLMDLY 

b AIWPLTSFSAYSCCT*WIST- 

c LFGH*QVFQHIHVVPNGSLL-

tgcagggccaagacttagtagctgggtgtggttagtggactattgggcaaggttagtcat 
3721 + + + + + + 3780 

acgtcccggttctgaatcatcgacccacaccaatcacctgataacccgttccaatcagta 

a CRAKT* * LGVVSGLLGKVSH 

b AGPRLSSWVWLVDYWARLVI 

c QGQDLVAGCG* WTI GQ G * S L - 

tgtagggggcaactgtctggcagtccaggagaatctttctctgtcactgagtataatgta 
3781 + + + + + + 3840 

acatcccccgttgacagaccgtcaggtcctcttagaaagagacagtgactcatattacat 

a CRGQLSGSPGESFSVTEYNV 

b VGGNCLAVQENLSLSLSIM* 

c *GATVWQSRRIFLCH*V*CN- 

atatgccagtaagtgatagcaggtattatagtgaattcatagaatattctacttatgtaa 
3841 + + + + + + 3900 

tatacggtcattcactatcgtccataatatcacttaagtatcttataagatgaatacatt 
Figure 8-13

a ICQ*VIAGIIVNS*NILLM*

b YASK**QVL**IHRIFYLCN-

c MPVSDSRYYSEFIEYSTYVI-

ttctatttattcaaaggtagctaccacaatacccagaatgtaatgaagctcagaaggcct 

3901 + + + + + + 3960 

aagataaataagtttccatcgatggtgttatgggtcttacattacttcgagtcttccgga 

a FYLFKGSYHNTQNVMKLRRP 

b SIYSKVATTIPRM**SSEGL- 

c LFIQR* LPQYPECNEAQKA *- 

agtgaaatttttactatgtcttatgttcttggattttctccttagaaaactcctggaggg 

3961 + + + + + + 4020 

tcactttaaaaatgatacagaatacaagaacctaaaagaggaatcttttgaggacctccc

a SEI FTMSYVLGFS P * KTPGG 

b VKFLLCLMFLDFLLRKLLEG- 

c *NFYYVLCSWIFSLENSWRV-

tgaagagactagatttagcacatttgcaggaagcatcactgggccactgtatacacaccg 

4021 + + + + + + 4080 

acttctctgatctaaatcgtgtaaacgtccttcgtagtgacccggtgacatatgtgtggc 

a * R D * I *HICRKHHWATVYTP 

b EETRFSTFAGSITGPLYTHR

c KRLDLAHLQEASLGHC I HTD- 

acccccaatcacaatatccagtaagattcagaaaaccaaggtggaagctcccaagcttaa 

4081 + + + + + + 4140 

tgggggttagtgttataggtcattctaagtcttttggttccaccttcgagggttcgaatt

a TPNHNI Q*DSENQGGS S Q A * 

b PPITI SSKIQKTKVEAPKLK- 

c PQSQY PVRFRKPRWKLP S L R - 

ggtccaacacaaatttgtcgaggagatcatagaggaaaccaaagtggaggatgagaagtc 

4141 + + + + + + 4200 

ccaggttgtgtttaaacagctcctctagtatctcctttggtttcacctcctactcttcag 

a GPTQI CRGDHRGNQSGG * EV 

b VQHKFVEEI I EETKVEDEKS 

c SNTNLSRRS*RKPKWRMRSQ-
Figure 8-14

agaaatggaagaggccctgacagccattacagaggaattggccgcttccatgaaggaaga 
4201 + + + + + + 4260 

tctttaccttctccgggactgtcggtaatgtctccttaaccggcgaaggtacttccttct 

a RNGRGPDSHYRGIGRFHEGR 

b EMEEALTAI TE ELAASMKEE 

c KWKRP*QPLQRNWPLP*RKR-



gaagaaagaagcagcagaagaaaaggaagaggaacccgaagctgaagaagaagaagtagc 

4261 + + + + + + 4320

cttctttcttcgtcgccttcttttccttctccttgggcttcgacttcttcttcttcatcg 



a EERSSRRKGRGTRS* RRRSS 

b KKEAAEEKEEE PEAEEEEVA 

c RKKQQKKRKRNPKLKKKK* I 



tgccaaaaagtctccagtgaaagcaactgcacctgaagttaaagaagaggaaggggaaaa 

4321 + + + + + + 4380

acggtttttcagaggtcactttcgttgacgtggacttcaatttcttctccttcccctttt 



a CQKVSSESNCT* S*RRGRGK 

b AKKSPVKATAPEVKEEEGEK- 

c PKSLQ* KQLHLKLKKRKGKR- 



ggaggaagaagaaggccaggaagaagaggaggaagaagatgagggagctaagtcagacca 

4381 + + + + + + 4440

cctccttcttcttccggtccttcttctcctccttcttctactccctcgattcagtctggt 



a GGRRRPGRRGGRR*GS * V R P 

b EEEEGQEEEEEEDEGAKSDQ 

c RKKKARKKRRKKMRELSQTK- 

agccgaagagggaggatccgagaaggaaggctctagtgaaaaagaggaaggtgagcagga 
4441 + + + + + + 4500 

tcggcttctccctcctaggctcttccttccgagatcactttttctccttccactcgtcct 

a SRRGRIREGRL* *KRGR*AG 

b AEEG GSEKEGSSEKEEGEQE- 

c PKREDPRRKALVKKRKVSRK- 



agaaggagaaacagaagctgaagctgaaggagaggaagccgaagctaaagaggaaaagaa 
4501 + + + + + + 4560

tcttcctctttgtcttcgacttcgacttcctctccttcggcttcgatttctccttttctt 



Figure 8-15 

a RRRNRS * S * RRGSRS * RGKE 

b EGETEAEAEGEEAEAKEEKK 

c KEKQKLKLKERKPKLKRKRK- 

agtggaggaaaagagtgaggaagtggctaccaaggaggagctggtggcagatgccaaggt 

4561 + + + + + + 4620

tcacctccttttctcactccttcaccgatggttcctcctcgaccaccgtctacggttcca 

a SGGKE*GSGYQGGAGGRCQG 

b VEEKSEEVATKEELVADAKV- 

c WRKRVRKWLPRRSWWQMPRW-

ggaaaagccagaaaaagccaagtctcctgtgccaaaatcaccagtggaagagaaaggcaa 

4621 + + + + + + 4680 

ccttttcggtctttttcggttcagaggacacggttttagtggtcaccttctctttccgtt

a GKARKSQVSCAKITSGRERQ 

b EKPEKAKS PVPKS PVEEKGK 

c KSQKKPSLLCQNHQWKRKAS- 

gtctcctgtgcccaagtcaccagtggaagagaaaggcaagtctcctgtgcccaagtcacc

4681 + + + + + + 4740 

cagaggacacgggttcagtggtcaccttctctttccgttcagaggacacgggttcagtgg 

a VSCAQVTSGRERQVSCAQVT 

b SPVPKSPVEEKGKSPVPKSP- 

c LLCPSHQWKRKASLLCPSHQ- 

agtggaagagaaaggcaagtctcctgtgccgaaatcaccagtggaagagaaaggcaagtc 

4741 + + + + + + 4800 

tcaccttctctttccgttcagaggacacggctttagtggtcaccttctctttccgttcag 

a SGRERQVSCAEITSGRERQV 

b VEEKGKSPVPKSPVEEKGKS 

c WKRKASL LCRNHQWKRKASL- 

tcctgtgtcaaaatcaccagtggaagagaaagccaaatctcctgtgccaaaatcaccagt 

4801 + + + + + + 4860 

aggacacagttttagtggtcaccttctctttcggtttagaggacacggttttagtggtca 

a SCVKITSGRESQISCAKITS 

b PVSKSPVEEKAKSPVPKSPV- 

c LCQNHQWKRKPNLLCQNHQW- 
Figure 8-16



ggaagaggcaaagtcaaaagcagaagtggggaaaggtgaacagaaagaggaagaagaaaa 

4861 + + + + + + 4920

ccttctccgtttcagttttcgtcttcacccctttccacttgtctttctccttcttctttt 

a GRGKVKSRSGER* TERGRRK 

b EEAKSKAEVGKGEQKEEEEK- 

c KRQSQKQKWGKVNRKRKKKR- 

ggaagtcaaggaagctcccaaggaagagaaggtagagaaaaaggaagagaaaccaaagga 

4921 + + + + + + 4980 

ccttcagttccttcgagggttccttctcttccatctctttttccttctctttggtttcct

a GSQGSSQGREGREKGRETKG

b EVKEAPKEEKVEKKEEKPKD

c KSRKLPRKRR*RKRKRNQRM-

tgtgccagagaagaagaaagctgagtcccctgtaaaggaggaagctgtggcagaggtggt 

4981 + + + + + + 5040

acacggtctcttcttctttcgactcaggggacatttcctccttcgacaccgcctccacca 

a CAREEES *VPCKGGSCGRGG 

b VPEKKKAESPVKEEAVAEVV- 

c CQRRRKLS PL * RRKLWQRWS- 

caccatcaccaaatcggtaaaggtgcacttggagaaagagaccaaagaagaggggaagcc 

5041 + + + + + + 5100 

gtggtagtggtttagccatttccacgtgaacctctttctctggtttcttctccccttcgg 

a HHHQI GKGALGERDQRRGEA 

b TITKSVKVHLEKETKEEGKP 

c PSPNR *RCTWRKRPKKRGSH- 

actgcagcaggagaaagagaaggagaaagcgggaggagagggaggaagtgaggaggaagg 

5101 + + + + + + 5160 

tgacgtcgtcctctttctcttcctctttcgccctcctctccctccttcactcctccttcc 

a TAAGEREGESGRRGRK* GGR 

b LQQEKEKEKAGGEGGS EEEG 

c CSRRKRRRKREEREEVRRKG- 

gagtgataaaggtgccaagggatccaggaaggaagacatagctgtcaatggggaggtaga 

5161 + + + + + + 5220 

ctcactatttccacggttccctaggtccttccttctgtatcgacagttacccctccatct 
Figure 8-17

a E* *RCQGIQEGRHSCQWGGR 

b SDKGAKGSRKEDIAVNGEVE 

c VI KVPRDPGRKT* LSMGR* K - 

aggaaaagaggaggtagagcaggagaccaaggaaaaaggcagtgggagggaagaggagaa 

5221 + + + + + + 5280 

tccttttctcctccatctcgtcctctggttcctttttccgtcaccctcccttctcctctt 

a RKRGGRAGDQGKRQWEGRGE 

b GKEEVEQETKEKGSGREEEK-

c EKRR* SRRPRKKAVGGKRRK- 

aggcgttgtcaccaatggcctagacttgagcccagcagatgaaaagaaggggggtgataa 

5281 + + + + + + 5340 

tccgcaacagtggttaccggatctgaactcgggtcgtctacttttcttccccccactatt 

a RRCHQWPRLEPSR* KEGG* * 

b GVVTNGLDLS PADEKKGGDK 

c ALSPMA*T*AQQMKRRGVIK- 

aagtgaggagaaagtggtggtgaccaaaacggtagaaaaaatcaccagtgaggggggaga 

5341 + + + + + + 5400

ttcactcctctttcaccaccactggttttgccatcttttttagtggtcactcccccctct 

a K*GESGGDQNGRKNHQ*GGR 

b SEEKVVVTKTVEKI TSEGGD 

C VRRKWW* PKR* KKS PVRGEM- 

tggtgctaccaaatacatcaccaaatctgtaaccgtcactcaaaaggttgaagagcacga 

5401 + + + + + + 5460 

accacgatggtttatgtagtgatttagacattggcagtgagttttccaacttctcgtact 

a WCYQIHH*ICNRHSKG*RA* 

b GATKYI TKSVTVTQKVEEHE 

C VLPNTSLNL* PSLKRLKSMK- 

agagacctttgaggagaaactagtgtctactaaaaaggtagaaaaagtcacttcacacgc 

5461 + + + + + + 5520 

tctctggaaactcctctttgatcacagatgatttttccatctttttcagtgaagtgtgcg 

a RDL * GETSVY* KGRKSHFTR 

b ETFEEKLVSTKKVEKVTSHA- 

c RPLRRN*CLLKR*KKSLHTP-
Figure 8-18

catagtaaaggaagtcacccagagtgactaagatttgagtccattgcaaaaggttaagcc 

5521 + + + + + + 5580 

gtatcatttccttcagtgggtctcactgattctaaactcaggtaacgttttccaattcgg 

a HSKGSHPE* LRFESIAKG*A 

b IVKEVTQSD* DLSPLQKVKP- 

c **RKSPRVTKI*VHCKRLSH-

atatgacaatttcaaaatgcatgtgattggcagcttcaaaacagaacgggttctcccatg 

5581 + + + + + + 5640 

tatactgttaaagttttacgtacactaaccgtcgaagttttgtcttgcccaagagggtac 

a I*QFQNACDWQLQNRTGSPM 

b YDNFKMHVI GS FKTERVLPW 

c MTISKCM* LAASKQNGFSHG- 

ggggctccagacattgtattttactttgtgcaatatgaggggactgcatgcaagctcagg 

5641 + + + + + + 5700 

ccccgaggtctgtaacataaaatgaaacacgttatactcccctgacgtacgttcgagtcc

a GAPDIVFYFVQYEGTACKLR 

b GLQTLYFTLCNMRGLHASSG- 

c GSRHCILLCAI *GDCMQAQG- 

gtgctccctcctcagtctttgggggattcaaatgcatgatattgtatgtacctgggaaat 

5701 + + + + + + 5760 

cacgagggaggagtcagaaaccccctaagtttacgtactataacatacatggacccttta 

a VLPPQSLGDSNA * YCMYLGN 

b CSLLSLWGIQMHDIVCTWEI- 

c APSSVFGGFKCM I LYVPGKF- 

ttgccgatttcctaagctgttggaagggggtcacttaaggggggatgtcttgagatgtat 

5761 + + + + + + 5820 

aacggctaaaggattcgacaaccttcccccagtgaattcccccctacagaactctacata 

a LPIS*AVGRGSLKGGCLEMY

b CRFPKLLEGGHLRGDVLRCI- 

c ADFLSCWKGVT*GGMS * DVL- 

tatgcaaagtaccaactgagccaaaaacaataaatgaaacacagaactcagccttaagaa 

5821 + + + + + + 5880 

atacgtttcacggttgactcggtttttgttatttactttgtgtcttgagtcggaattctt 
Figure 8-19

a YAKYQLSQKQ *MKHRTQP * E 

b MQSTN*AKNNK*NTELSLKK- 

c CKVPTE P KT I NETQNSALRK- 

agctatatatgaataattatgtttacctcactggtgcatttaaaatggacttttgttcat 

5881 + + + + + + 5940 

tcgatatatacttattaatacaaatggagtgaccacgtaaattttacctgaaaacaagta 

a S Y I * I IMFTSLVHLKWTFVH 

b AIYE*LCLPHWCI * N G L L F M 

c LYMNNYVYLTGAFKMDFCSW- 

gggagaacctcgttgacatgcacagtctgcaatcttatgttgatcgatgttaaacgtcac 

5941 + + + + + + 6000

cccccttggagcaactgtacgtgtcaaacgttagaatacaactagctacaatttgcagtg 

a GRTSLTCTVCNLMLIDVKRH 

b GEPR*HAQFAILC*SMLNVT- 

C ENLVDMHSLQSYVDRC * TSQ- 

agcagtacttgctcaataaaggtcatattggaaacatagtcaattgctgagtcttatgtc 

6001 + + + + + + 6060 

tcgtcatgaacgagttatttccagtataacctttgtatcagttaacgactcagaatacag 

a SSTCSIKVILET*SIAESYV 

b AVLAQ*RSYWKHSQLLSLMS- 

c QYLLNKGH I GNI VNC * VLCH- 

atttctctttttctaatttttatttatttatttttatttagagatggggtcttgctatgt

6061 + + + + + + 6120

taaagagaaaaagattaaaaataaataaataaaaataaatctctaccccagaacgataca

a ISLFLIFIYLFLFRDGVLLC

b FLFF*FLFIYFYLEMGSCYV-

c FSFSNFYLF I F I * RWGLAMW- 

ggccccaagcagtcctcccacctcagccacccaaagtgctgggattacaggcatgagcca 

6121 + + + + + + 6180 

ccggagttcgtcaggagggtggagtcggtgggtttcacgaccctaatgtccgtactcggt 

a GLKQSSHLSHPKCWDYRHEP 
b ASSSPPTSATQSAGITGMSH 
c PQAVLPPQPPKVLGLQA* AT- 
Figure 8-20

ccacgcccagcctgttatgccatttcaaagtgaaatctccactacctgaagcttgc

6181 + + + + + 6236

ggtgcgggtcggacaatacggtaaagtttcactttagaggtgatggacttcgaacg

a PRPACYAISK*NLHYLKL-

b HAQPVMPFQSEISTT*SL

c TPSLLCHFKVKSPLPEAC-

Enzymes that do cut: NONE

Enzymes that do not cut: NotI



WO 98/45322 PCT/IB98/00705

72/169



Figure 9-1 Neurofilament H 

(Linear) MAP of: hsnfh1.gcg check: 1349 from: 1 to: 1162

ID HSNFH1 standard; DNA; HUM; 1162 BP. 
XX 

AC X15306; X12501; 
XX 

NI g35028 

XX . . . 

With 1 enzymes: NOTI 

October 31, 1996 14:30 

ccactccggagtcctctgcccgcttcccgacctcgagggtctcctctgacgcgcagcgtc 
1 + + + + + + 60

ggtgaggcctcaggagacgggcgaagggctggagctcccagaggagactgcgcgtcgcag 

a PLRSPLPASRPRGSPLTRSV 

b HSGVLCPLPDLEGLL*RAAS- 

C TPES SARFPTSRVS S DAQRR- 

gattccccttccctcctcggtcccctgccccgcccctctcactgcgcggagccggtcgcc

61 + + + + + + 120

ctaaggggaagggaggagccaggggacggggcggggagagtgacgcgcctcggccagcgg

a DSPSLLGPLPRPSHCAEPVA 

b IPLPSSVPC PAPLTARSRSP- 

c FPFPPRSPAPPLSLRGAGRR- 

ggggggccgcaggggaggaggcggagaggcggggccctcctccccaccctctcactgcca 

121 + + + + + + 180 

ccccccggcgtcccctcctccgcctctccgccccgggaggaggggtgggagagtgacggt 

a GGPQGRRRRGGALLPTLSLP 

b GGRRGGGGEAG P S S P PS HCQ 

c GAAGEEAERRGPP PHPLTAK- 

aggggttggacccggccgcggcggctataaaagggccggcgccctggtcgtgccgcagtg 

181 + + + + + + 240 

tccccaacctgggccggcgccgccgatattttcccggccgcgggaccagcacggcgtcac 

a RGWTRPRRL* KGRRPGRAAV 

b GVGPGRGGYKRAGALVVPQC 

C GLDPAAAAI KGPAPWSCRSA- 
Figure 9-2

cctcccgccccgtcccggcctcgcgcacctgctcaggccatgatgagcttcggcggcgcg 
241 + + + + + + 300

ggagggcggggcagggccggagcgcgtggacgagtccggtactactcgaagccgccgcgc 



a PPAPSRPRAPAQAMMSFGGA 
b LPPRPGLAHLLRP* *ASAAR 

c SRPVPASRTCSGHDELRRRG 



gacgcgctgctgggcgccccgttcgcgccgctgcatggcggcggcagcctccactacgcg 

301 + + + + + + 360

ctgcgcgacgacccgcggggcaagcgcggcgacgtaccgccgccgtcggaggtgatgcgc 



a DALLGAPFAP LHGGGS LHYA 

b TRCWAPRSRRCMAAAASTTR 

c RAAGRPVRAAAWRRQPPLRA-

ctagcccgaaagggtggcgcaggcgggacgcgctccgccgctggctcctccagcggcttc 
361 + + + + + + 420

gatcgggctttcccaccgcgtccgccctgcgcgaggcggcgaccgaggaggtcgccgaag 

a LARKGGAGGTRSAAGS S SG F 

b * PERVAQAGRAPPLAPPAAS 

c SPKGWRRRDALRRWLLQRLP-



cactcgtggacacggacgtccgtgagctccgtgtccgcctcgcccagccgcttccgtggc 
421 + + + + + + 480

gtgagcacctgtgcctgcaggcactcgaggcacaggcggagcgggtcggcgaaggcaccg



a HSWTRTSVSSVSASPSRFRG

b TRGHGRP *APC PPRPAASVA 

c LVDTDVRELRVRLAQPLPWR- 

gcaggcgccgcctcaagcaccgactcgctggacacgctgagcaacgggccggagggctgc 
481 + + + + + + 540

cgtccgcggcggagttcgtggctgagcgacctgtgcgactcgttgcccggcctcccgacg 

a AGAASSTDSLDTLSNGPEGC 

b QAPPQAPTRWTR*ATGRRAA 

C RRRLKHRLAGHAEQRAGG LH- 



atggtggcggtggccacctcacgcagtgagaaggagcagctgcaggcgctgaacgaccgc
541 + + + + + + 600

taccaccgccaccggtggagtgcgtcactcttcctcgtcgacgtccgcgacttgctggcg 
Figure 9-3

a MVAVATSRS E KE QLQALNDR 

b WWRWPPHAVRRSSCRR*T TA- 

C GGGGHLTQ* EGAAAGAERPL- 

ttcgccgggtacatcgacaaggtgcggcagctggaggcgcacaaccgcagcctggagggc 

601 + + + + + + 660

aagcggcccatgtagctgttccacgccgtcgacctccgcgtgttggcgtcggacctcccg 

a FAGYI DKVRQLEAHNRSLEG 

b SPGTSTRCGSWRRTTAAWRA- 

c RRVHRQGAAAGGAQ PQPGGR- 

gaggctgcggcgctgcggcagcagcaggcgggccgctccgctatgggcgagctgtacgag 

661 + + + + + + 720 

ctccgacgccgcgacgccgtcgtcgtccgcccggcgaggcgatacccgctcgacatgctc 

a EAAALRQQQAGRSAMGELYE 

b RLRRCGS SRRAAPLWASCTS 

c GCGAAAAAGG PLRYGRAVRA- 

cgcgaggtccgcgagatgcgcggcgcggtgctgcgcctgggcgcggcgcgcggtcagcta 

721 + + + + + + 780 

gcgctccaggcgctctacgcgccgcgccacgacgcggacccgcgccgcgcgccagtcgat 

a REVREMRGAVLRLGAARGQ L 

b ARSARCAARCCAWARRAVSY-

c RGPRDARRGAAPGRGARSAT- 

cgcctggagcaggagcacctgctcgaggacatcgcgcacgtgcgccagcgcctagacgac 

781 + + + + + + 840 

gcggacctcgtcctcgtggacgagctcctgtagcgcgtgcacgcggtcgcggatctgctg 

a RLEQEHLLEDIAHVRQRLDD 
b AWSRSTCSRTSRTCASA*TT 

c PGAGAPARGHRARAPAPRRR-

gaggcccggcagcgagaggaggccgaggcggcggcccgcgcgctggcgcgcttcgcgcag 

841 + + + + + + 900 

ctccgggccgtcgctctcctccggctccgccgccgggcgcgcgaccgcgcgaagcgcgtc 

a EARQREEAEAAARALARFAQ 

b RPGSERRPRRRPARWRASRR 

c GPAARGGRGGG PRAGALRAG- 
Figure 9-4

gaggccgaggcggcgcgcgtggacctgcagaagaaggcgcaggcgctgcaggaggagtgc 

901 + + + + + + 960

ctccggctccgccgcgcgcacctggacgtcttcttccgcgtccgcgacgtcctcctcacg 

a EAEAARVDLQKKAQALQEEC 

b RPRRRAWTCRRRRRRCRRSA- 

C GRGGARG PAEEGAGAAGGVR- 

ggctacctgcggcgccaccaccaggaagaggtgggcgagctgctcggccagatccagggc 

961 + + + + + + 1020 

ccgatggacgccgcggtggtggtccttctccacccgctcgacgagccggtctaggtcccg 

a GYLRRHHQEEVGELLGQIQG 

b ATCGATTRKRWASCSARSRA- 

C LPAAPPPGRGGRAAR PDPGL- 

tccggcgccgcgcaggcgcagatgcaggccgagacgcgcgacgccctgaagtgcgacgtg 

1021 + + + + + + 1080 

aggccgcggcgcgtccgcgtctacgtccggctctgcgcgctgcgggacttcacgctgcac

a SGAAQAQMQAETRDALKCDV 

b PAPRRRRCRPRRATP * SAT* 

c RRRAG ADAGRDARRPEVRRD- 

acgtcggcgctgcgcgagattcgcgcgcagcttgaaggccacgcggtgcagagcacgctg 

1081 + + + + + + 1140 

tgcagccgcgacgcgctctaagcgcgcgtcgaacttccggtgcgccacgtctcgtgcgac 

a TSALREIRAQLEGHAVQSTL

b RRRCARFARSLKATRCRARC 

c VGAARDS RAA * R PRGAE HAA- 

cagtccgaggagtggttccgag 

1141 + + 1162

gtcaggctcctcaccaaggctc 

a QSEEWFR 

b SPRSGSE- 

c VRGVVP 

Enzymes that do cut: NONE 

Enzymes that do not cut: NotI 
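
The map layout used in these figures (upper strand, position ruler, complementary strand, then reading frames a, b and c of the upper strand) can be reproduced with a short script. The following is only an illustrative sketch and not the program that generated the figures: the function names are hypothetical, the standard genetic code is assumed, and the NotI recognition sequence is taken to be GCGGCCGC. It is applied to the last row of Figure 9-4 (positions 1141 to 1162).

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order (assumption:
# the figures use the standard code, with '*' marking a stop codon).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

NOTI_SITE = "gcggccgc"  # NotI recognition sequence (palindromic)


def complement(seq):
    """Return the base-by-base complement, as printed under the ruler."""
    return seq.translate(str.maketrans("acgt", "tgca"))


def translate(seq, frame):
    """Translate seq in frame 0, 1 or 2 (lines a, b, c); 'X' for unknown codons."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def site_positions(seq, site=NOTI_SITE):
    """Return 1-based start positions of every occurrence of site in seq."""
    hits, start = [], seq.find(site)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(site, start + 1)
    return hits


row = "cagtccgaggagtggttccgag"  # Figure 9-4, positions 1141-1162
print(row)
print(complement(row))
for label, frame in zip("abc", range(3)):
    print(label, translate(row, frame))
print("NotI sites:", site_positions(row) or "NONE")
```

Because every full row holds 60 nucleotides, a multiple of three, the three frames keep the same offsets from row to row, which is why a single row can be checked in isolation in this way.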
Figure 10-1 Presenilin I

(Linear) MAP of: hsu40379 check: 135 from: 1 to: 1392 

RL;HSU40379 - Human presenilin 1-463 (AD3-3) mRNA, complete cds.
ID HSU40379 standard; RNA; HUM; 1392 BP.
AC U40379;
NI g1244637

DT 05-APR-1996 (Rel. 47, Created)

DT 15-AUG-1996 (Rel. 48, Last updated, Version 3) . . . 
With 1 enzymes: NOTI 

atgacagagttacctgcaccgttgtcctacttccagaatgcacagatgtctgaggacaac 

1 + + + + + + 60

tactgtctcaatggacgtggcaacaggatgaaggtcttacgtgtctacagactcctgttg

a MTELPAPLSYFQNAQMSEDN 

b *QSYLHRCPTSRMHRCLRTT- 

C DRVTCTVVLLPECTDV* G Q P - 

cacctgagcaatactaatgacaatagagaacggcaggagcacaacgacagacggagcctt 

61 + + + + + + 120

gtggactcgttatgattactgttatctcttgccgtcctcgtgttgctgtctgcctcggaa 

a HLSNTNDNRERQEHNDRRSL 

b T*AI LMTIENGRSTTTDGAL 

c PEQY* *Q*RTAGAQRQTEPW- 

ggccaccctgagccattatctaatggacgaccccagggtaactcccggcaggtggtggag 

121 + + + + + + 180 

ccggtgggactcggtaatagattacctgctggggtcccattgagggccgtccaccacctc 

a GHPEPLSNGRPQGNSRQVVE 

b A TLSHYLMDDPRVTPGRWWS 

c PP*AII*WTTPG*LPAGGGA- 

caagatgaggaagaagatgaggagctgacattgaaatatggcgccaagcatgtgatcatg 

181 + + + + + + 240 

gttctactccttcttctactcctcgactgtaactttataccgcggttcgtacactagtac 

a QDEEEDEELTLKYGAKHVIM 

b KMRKKMRS*H*NMAPSM* SC 

C R*GRR*GADI EIWRQACDHA- 



Figure 10-2 

ctctttgtccctgtgactctctgcatggtggtggtcgtggctaccattaagtcagtcagc 

241 + + + + + + 300 

gagaaacagggacactgagagacgtaccaccaccagcaccgatggtaattcagtcagtcg 

a LFVPVTL CMVVVVA TIKSVS 

b SLSL*LSAWWWSWLPLSQSA-

c LCPCDSLHGGGRGYH* V S Q L - 

ttttatacccggaaggatgggcagctaatctataccccattcacagaagataccgagact 

301 + + + + + + 360 

aaaatatgggccttcctacccgtcgattagatatggggtaagtgtcttctatggctctga 

a FYTRKDGQLI YTPFTEDTET 

b FIPGRMGS*SIPHSQKIPRL- 

c LYPEGWAANLYP IHRRYRDC- 

gtgggccagagagccctgcactcaattctgaatgctgccatcatgatcagtgtcattgtt 
361 + + + + + + 420

cacccggtctctcgggacgtgagttaagacttacgacggtagtactagtcacagtaacaa 

a VGQRALHS ILNAAIM I SVIV 

b WAREPCTQF*MLPS*SVSLL- 

c GPESPALNS ECCHHDQCHCC- 

gtcatgactatcctcctggtggttctgtataaatacaggtgctataaggtcatccatgcc 

421 + + + + + + 480 

cagtactgataggaggaccaccaagacatatttatgtccacgatattccagtaggtacgg 

a VMTILLVVLYKYRCYKVI HA 

b S*LSSWWFCINTGAIRSSMP-

C HDYPPGGSV*IQVL*GHPCL- 

tggcttattatatcatctctattgttgctgttctttttttcattcatttacttgggggaa 

481 + + + + + + 540 

accgaataatatagtagagataacaacgacaagaaaaaaagtaagtaaatgaaccccctt 

a WLIISSLLLLFFFSFIYLGE 

b GLLYHLYCCCSFFHSFTWGK- 

c AYYI I S I VAVLF FI HL L GG S - 

gtgtttaaaacctataacgttgctgtggactacattactgttgcactcctgatctggaat 

541 + + + + + + 600

cacaaattttggatattgcaacgacacctgatgtaatgacaacgtgaggactagacctta 
Figure 10-3

a VFKTYNVAVDYITVALLIWN 

b CLKPITLLWTTLLLHS*SGI- 

c V*NL* RCCGLHYCCTPDLE F - 

tttggtgtggtgggaatgatttccattcactggaaaggtccacttcgactccagcaggca 

601 + + + + + + 660 

aaaccacaccacccttactaaaggtaagtgacctttccaggtgaagctgaggtcgtccgt 

a FGVVGMISIHWKGPLRLQQA

b LVWWE*FPFTGKVHFDSSRH

c WCGGNDFHSLERSTSTPAGI-

tatctcattatgattagtgccctcatggccctggtgtttatcaagtacctccccgaatgg 

661 + + + + + + 720 

atagagtaatactaatcacgggagtaccgggaccacaaatagttcatggagggacttacc 

a YLIMISALMALVFIKYLPEW

b ISL*LVPSWPWCLSSTSLNG- 

c SHYD*CPHGPGVYQVPP*MD- 

actgcgtggctcatcttggctgtgatttcagtatatgatttagtggctgttttgtgtccg 

721 + + + + + + 780 

tgacgcaccgagtagaaccgacactaaagtcatatactaaatcaccgacaaaacacaggc 

a TAWLILAVI SVYDLVAVLC P 

b LRGSSWL*FQYMI*WLFCVR- 

c CVAHLGCDFSI*FSGCFVSE- 

aaaggtccacttcgtatgctggttgaaacagcccaggagagaaatgaaacgctttttcca 

781 + + + + + + 840

tttccaggtgaagcatacgaccaactttgtcgggtcctctctttactttgcgaaaaaggt 

a KGPLRMLVETAQERNETLFP 

b K VHFVCWLKQPRREMKRF FQ 

C RSTSYAG* NS PG E K* NAF S S - 

gctctcatttactcctcaacaatggtgtggttggtgaatatggcagaaggagacccggaa 

841 + + + + + + 900 

cgagagtaaatgaggagttgttaccacaccaaccacttataccgtcttcctctgggcctt 

a ALIYSSTMVWLVNMAEGDPE 

b LSFTPQQWCGW* IWQKETRK- 

c SHLLLNNGVVGEYGRRRPGS- 
Figure 10-4

gctcaaaggagagtatccaaaaattccaagtataatgcagaaagcacagaaagggagtca 

901 + + + + + + 960 

cgagtttcctctcataggtttttaaggttcatattacgtctttcgtgtctttccctcagt 

a AQRRVSKNSKYNAE STERES 

b LKGEYPKIPSIMQKAQKGSH- 

C SKESIQKFQV*CRKHRKGVT- 

caagacactgttgcagagaatgatgatggcgggttcagtgaggaatgggaagcccagagg 

961 + + + + + + 1020 

gttctgtgacaacgtctcttactactaccgcccaagtcactccttacccttcgggtctcc

a QDTVAENDDGGFSEEWEAQR 

b KTLLQRMMMAGSVRNGKPRG 

c RHCCRE* * W R V Q * GMGS PEG- 

gacagtcatctagggcctcatcgctctacacctgagtcacgagctgctgtccaggaactt

1021 + + + + + + 1080 

ctgtcagtagatcccggagtagcgagatgtggactcagtgctcgacgacaggtccttgaa 

a DSHLGPHRSTPESRAAVQEL 

b TVI*GLIALHLSHELLSRNF

c QSSRASSLYT*VTSCCPGTF- 

tccagcagtatcctcgctggtgaagacccagaggaaaggggagtaaaacttggattggga 

1081 + + + + + + 1140 

aggtcgtcataggagcgaccacttctgggtctcctttcccctcattttgaacctaaccct 

a SSSILAGEDPEERGVKLGLG 

b PAVSSLVKTQRKGE * NLDWE 

c Q QYPRW*RPRGKGS KTW I G R - 

gatttcattttctacagtgttctggttggtaaagcctcagcaacagccagtggagactgg 

1141 + + + + + + 1200 

ctaaagtaaaagatgtcacaagaccaaccatttcggagtcgttgtcggtcacctctgacc 

a DFIFYSVLVGKASATASGDW 

b ISFSTVFWLVKPQQQPVETG- 

c FHFLQCSGW*SLSNSQWRLE- 

aacacaaccatagcctgtttcgtagccatattaattggtttgtgccttacattattactc

1201 + + + + + + 1260 

ttgtgttggtatcggacaaagcatcggtataattaaccaaacacggaatgtaataatgag 



Figure 10-5

a NTTIACFVAILIGLCLTLLL 

b TQP*PVS*PY*LVCALHYYS- 

c HNHSLFRSHINWFVPYI I TP- 

cttgccattttcaagaaagcattgccagctcttccaatctccatcacctttgggcttgtt 

1261 + + + + + + 1320

gaacggtaaaagttctttcgtaacggtcgagaaggttagaggtagtggaaacccgaacaa 

a LAIFKKALPALPI S ITFGLV 

b LPFSRKHCQLFQS PS PLGLF- 

c CHFQES IASS SNLHHLWACF- 

ttctactttgccacagattatcttgtacagccttttatggaccaattagcattccatcaa 

1321 + + + + + + 1380 

aagatgaaacggtgtctaatagaacatgtcggaaaatacctggttaatcgtaaggtagtt

a FYFATDYLVQPFMDQLAFHQ 

b STLPQIILYSLLWTN*HSIN-

c LLCHRLSCTAFYGPISIPSI- 

ttttatatctag 

1381 + 1392 

aaaatatagatc

a F Y I * - 

b F I S 

c L Y L 

Enzymes that do cut : 

NONE 

Enzymes that do not cut : 
Not I 
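
Each block of the maps in these figures shows a 60-base stretch of the sequence, a position ruler, the complementary strand, and the translation in the three forward reading frames (rows a, b and c, with * marking a stop codon). As a rough illustration only, the following Python sketch shows how such rows could be derived, assuming the standard genetic code. The function names are hypothetical, and this is not the program that generated the figures.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order (an assumption:
# the figures do not state which code table was used).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def complement(seq):
    """Return the base-by-base complement, as printed under each ruler."""
    return seq.translate(str.maketrans("acgt", "tgca"))


def translate(seq, frame):
    """Translate seq in forward frame 0, 1 or 2 (rows a, b, c); '*' is a stop."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3)
    )


# The final 12 bases of the map above.
tail = "ttttatatctag"
print(complement(tail))                         # aaaatatagatc
print([translate(tail, f) for f in range(3)])   # ['FYI*', 'FIS', 'LYL']
```

Within a full map, frames b and c continue across row boundaries, so the last residue of a row may be completed by the first bases of the next row; translating the whole sequence at once avoids that edge case.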
Figure 11-1 Presenilin II

(Linear) MAP of: hsstm2r check: 9487 from: 1 to: 2236

RL;HSSTM2R - Homo sapiens (clone F-T03796) STM-2 mRNA, complete cds.
ID HSSTM2R standard; RNA; HUM; 2236 BP.
AC L43964;
NI g951202

DT 24-AUG-1995 (Rel. 44, Created)

DT 17-FEB-1997 (Rel. 50, Last updated, Version 2) . . . 

With 1 enzymes: NOTI 

cgagcggcggcggagcaggcatttccagcagtgaggagacagccagaagcaagctattgg 
1 + + + + + + 60

gctcgccgccgcctcgtccgtaaaggtcgtcactcctctgtcggtcttcgttcgataacc 

a RAAAEQAFPAVRRQPEASYW

b ERRRSRHFQQ* GDSQKQA I G 

C SGGGAGI S S SEETARS KLLE- 

agctgaaggaacctgagacagaagctagtcccccctctgaattttactgatgaagaaact 

61 + + + + + + 120 

tcgacttccttggactctgtcttcgatcaggggggagacttaaaatgactacttctttga 

a S * RNLRQKLVPPLNFTDEET 

b AEGT*DRS*SPL*ILLMKKL

C LKEPETEAS PPSEFY* * R N * - 

gaggccacagagctaaagtgacttttcccaaggtcgcccagcgaggacgtgggacttctc 

121 + + + + + + 180 

ctccggtgtctcgatttcactgaaaagggttccagcgggtcgctcctgcaccctgaagag 

a EATELK *LFPRSPSEDVGLL 

b R PQS*SDFSQGRPART W D F S 

C GHRAKVTF P KVAQRGRGT S Q - 

agacgtcaggagagtgatgtgagggagctgtgtgaccatagaaagtgacgtgttaaaaac 

181 + + + + + + 240

tctgcagtcctctcactacactccctcgacacactggtatctttcactgcacaatttttg 

a RRQESDVRELCDHRK* RVKN 

b DVRRVM*GSCVTIESDVLKT 

C TSGE*CEGAV*P*KVTC*KP-
Figure 11-2

cagcgctgccctctttgaaagccagggagcatcattcatttagcctgctgagaagaagaa 
241 + + + + + + 300 

gtcgcgacgggagaaactttcggtccctcgtagtaagtaaatcggacgactcttcttctt 

a Q RCPL* KPGS I IHLAC * E E E 

b SAALFESQGASFI * PAEKKK - 

c ALPSLKAREHHS FS LLRRRN- 

accaagtgtccgggattcagacctctctgcggccccaagtgttcgtggtgcttccagagg 
301 + + + + + + 360 

tggttcacaggccctaagtctggagagacgccggggttcacaagcaccacgaaggtctcc 

a TKCPGFRPLCGPKCSWCFQR 

b PSVRDSDLSAAPSVRGAS RG 

c QVSGIQTSLRPQVFVVLPEA- 

cagggctatgctcacattcatggcctctgacagcgaggaagaagtgtgtgatgagcggac 

361 + + + + + + 420 

gtcccgatacgagtgtaagtaccggagactgtcgctccttcttcacacactactcgcctg 

a QGYAHIHGL*QRGRSV* *AD 

b RAMLTF MASDSEEEVCDERT- 

c GLCSHSW PLTARKKCVMS GR- 

gtccctaatgtcggccgagagccccacgccgcgctcctgccaggagggcaggcagggccc 

421 + + + + + + 480 

cagggattacagccggctctcggggtgcggcgcgaggacggtcctcccgtccgtcccggg

a VPNVGRE PHAALLPGGQAG P 

b S L M S AES PT P RSCQEGRQG P 

c P*CRPRAPRRAPARRAGRAQ- 

agaggatggagagaacactgcccagtggagaagccaggagaacgaggaggacggtgagga 
481 + + + + + + 540

tctcctacctctcttgtgacgggtcacctcttcggtcctcttgctcctcctgccactcct 

a RGWREHCPV EKPGERGGR * G 

b EDGENTAQWRSQENEE D G E E 

C RMERTLP S GEARRTRRTVRR- 

N 

o 

t 

I
Figure 11-3

ggaccctgaccgctatgtctgtagtggggttcccgggcggccgccaggcctggaggaaga 
541 + + + + + + 600

cctgggactggcgatacagacatcaccccaagggcccgccggcggtccggacctccttct 



a GP*PLCL*WGSRAAARPGGR 

b DPD RYVCSGVPGRP PGLEEE 

c TLTAMSVVGFPGGRQAWRKS- 



gctgaccctcaaatacggagcgaagcacgtgatcatgctgtttgtgcctgtcactctgtg

601 + + + + + + 660

cgactgggagtttatgcctcgcttcgtgcactagtacgacaaacacggacagtgagacac 



a ADPQIRSEARDHAVCACHSV 

b LTLKYGAKHVIMLFV PVTLC 

c *PSNTERST*SCCLCLSLCA- 



catgatcgtggtggtagccaccatcaagtctgtgcgcttctacacagagaagaatggaca 
661 + + + + + + 720

gtactagcaccaccatcggtggtagttcagacacgcgaagatgtgtctcttcttacctgt 



a HDRGGSHHQVCALLHREEWT
b MIVVVATIKSVRFYTEKNGQ 
c *SWW*PPSSLCASTQRRMDS 



gctcatctacacgacattcactgaggacacaccctcggtgggccagcgcctcctcaactc 
721 + + + + + + 780

cgagtagatgtgctgtaagtgactcctgtgtgggagccacccggtcgcggaggagttgag 



a AHLHDIH*GHTLGGPAP PQL 

b LIYTTFTEDTPSVGQRLLNS
c SSTRHSLRTHPRWASASSTP 



cgtgctgaacaccctcatcatgatcagcgtcatcgtggttatgaccatcttcttggtggt 
781 + + + + + + 840

gcacgacttgtgggagtagtactagtcgcagtagcaccaatactggtagaagaaccacca 



a RAEHPHHDQRHRGYDHLLGG 
b VLNTLIMISVIVVMTIFLVV 

c C*TPSS*SASSWL*PSSWWC



gctctacaagtaccgctgctacaagttcatccatggctggttgatcatgtcttcactgat 
841 + + + + + + 900

cgagatgttcatggcgacgatgttcaagtaggtaccgaccaactagtacagaagtgacta
Figure 11-4

a ALQVPLLQVHPWLVD HVFTD 

b LYKYRCYKFIHGWLI MSSLM 

c STSTAATSSSMAG*SCLH*C- 

gctgctgttcctcttcacctatatctaccttggggaagtgctcaagacctacaatgtggc 

901 + + + + + + 960 

cgacgacaaggagaagtggatatagatggaaccccttcacgagttctggatgttacaccg 

a AAVPLHLYLPWGSAQDLQCG 

b LLFLFTYIYLGEVLKTYNVA- 

c CCSSSPISTLGKCSRPTMWP- 

catggactaccccaccctcttgctgactgtctggaacttcggggcagtgggcatggtgtg

961 + + + + + + 1020

gtacctgatggggtgggagaacgactgacagaccttgaagccccgtcacccgtaccacac

a HGLPHPLADCLELRGSGHGV 

b MDYPTLLLTVWNFGAVGMVC- 

C WTTPPSC*LSGTSGQWAWCA- 

catccactggaagggccctctggtgctgcagcaggcctacctcatcatgatcagtgcgct 

1021 + + + + + + 1080 

gtaggtgaccttcccgggagaccacgacgtcgtccggatggagtagtactagtcacgcga 

a HPLEGPSGAAAGLPHHDQCA 

b IHWKGPLVLQQAYLIMISAL

c STGRALWCCSRPTSS*SVRS- 

catggccctagtgttcatcaagtacctcccagagtggtccgcgtgggtcatcctgggcgc 

1081 + + + + + + 1140 

gtaccgggatcacaagtagttcatggagggtctcaccaggcgcacccagtaggacccgcg

a HGPSVHQVPPRVVRVGHPGR 

b MALVFIKYLPEWSAWVILGA- 

C WP*CSSSTSQSGPRGSSWAP- 

catctctgtgtatgatctcgtggctgtgctgtgtcccaaagggcctctgagaatgctggt 

1141 + + + + + + 1200 

gtagagacacatactagagcaccgacacgacacagggtttcccggagactcttacgacca 

a HLCV* SRGCAVSQRASENAG 

b I SVYDLVAVLCPKGP LRMLV 

C SLCMISWLCCVPKGL*ECW*- 



WO 98/45322 PCT/IB98/00705 

85/169 

Figure 11-5 

agaaactgcccaggagagaaatgagcccatattccctgccctgatatactcatctgccat 

1201 + + + + + + 1260 

tctttgacgggtcctctctttactcgggtataagggacgggactatatgagtagacggta 

a RNCPGEK*AHIPCPDILICH 

b ETAQERNEPI FPALI Y S SAM- 

c KLPRREMSPYSLP* YTHLPW- 

ggtgtggacggttggcatggcgaagctggacccctcctctcagggtgccctccagctccc 

1261 + + + + + + 1320

ccacacctgccaaccgtaccgcttcgacctggggaggagagtcccacgggaggtcgaggg

a GVDGWHGEAGPLLSGCPPAP 

b VWTVGMAKLD P S S QGALQ L P 

C CGRLAWRSWTPPLRVPSSSP- 

ctacgacccggagatggaagaagactcctatgacagttttggggagccttcataccccga

1321 + + + + + + 1380

gatgctgggcctctaccttcttctgaggatactgtcaaaacccctcggaagtatggggct

a LRPGDGRRLL*QFWGAFI PR 

b YDPEMEEDSYDSFGEPSYPE- 

C TTRRWKKTPM TVLGSLHTPK- 

agtctttgagcctcccttgactggctacccaggggaggagctggaggaagaggaggaaag 

1381 + + + + + + 1440 

tcagaaactcggagggaactgaccgatgggtcccctcctcgacctccttctcctcctttc 

a SL*ASLDWLPRGGAGGRGGK 

b VFEPPLTGYPGEELEEEEER 

c SLSLP* LATQGRSWRKRRKG- 

gggcgtgaagcttggcctcggggacttcatcttctacagtgtgctggtgggcaaggcggc 

1441 + + + + + + 1500 

cccgcacttcgaaccggagcccctgaagtagaagatgtcacacgaccacccgttccgccg 

a GREAWPRGLHLLQCAGGQGG 

b GVKLGLGDF I FYSVLVGKAA- 

C A*SLASGTSSSTVCWWARRL- 

tgccacgggcagcggggactggaataccacgctggcctgcttcgtggccatcctcattgg

1501 + + + + + + 1560

acggtgcccgtcgcccctgaccttatggtgcgaccggacgaagcaccggtaggagtaacc
Figure 11-6

a CHGQRGLEYHAGLLRGHPHW 

b ATGSGDWNTTLACFVA I L I G - 

c PRAAGTGIPRWPASWPSSLA- 

cttgtgtctgaccctcctgctgcttgctgtgttcaagaaggcgctgcccgccctccccat 

1561 + + + + + + 1620 

gaacacagactgggaggacgacgaacgacacaagttcttccgcgacgggcgggaggggta 

a LVSDPPAACCVQEGAARPPH 

b LCLTLLLLAVFKKALPALPI- 

c CV*PSCCLLCSRRRCPPSPS- 

ctccatcacgttcgggctcatcttttacttctccacggacaacctggtgcggccgttcat 

1621 + + + + + + 1680 

gaggtagtgcaagcccgagtagaaaatgaagaggtgcctgttggaccacgccggcaagta 

a LHHVRAHLLLLHGQPGAAVH

b SITFGLI FYFSTDNLVRPFM 

c PSRSGSSFTSPRTTWCGRSW- 

ggacaccctggcctcccatcagctctacatctgagggacatggtgtgccacaggctgcaa 

1681 + + + + + + 1740 

cctgtgggaccggagggtagtcgagatgtagactccctgtaccacacggtgtccgacgtt 

a GHPGLPSALHLRDMVCHRLQ 

b DTLASHQLYI *GTWCATGCK - 

c TPWPPISST SEGHGVPQA AS- 

gctgcagggaattttcattggatgcagttgtatagttttacactctagtgccatatattt 

1741 + + + + + + 1800

cgacgtcccttaaaagtaacctacgtcaacatatcaaaatgtgagatcacggtatataaa 

a AAGNFHWMQLYSFTL* CH I F 

b LQGIFIGCSCIVLHSSAIYF- 

c CREFSLDAVV* FYTLVPY I F - 

ttaagacttttctttccttaaaaaataaagtacgtgtttacttggtgaggaggaggcaga 

1801 + + + + + + 1860 

aattctgaaaagaaaggaattttttatttcatgcacaaatgaaccactcctcctccgtct 

a LRLFFP* KI KYVFTW*GGGR 

b *DFSFLKK* STCLLGEEEAE 

c KTFLSLKNKVRVYLVRRRQN-
Figure 11-7

accagctctttggtgccagctgtttcatcaccagactttggctcccgctttggggagcgc 

1861 + + + + + + 1920 

tggtcgagaaaccacggtcgacaaagtagtggtctgaaaccgagggcgaaacccctcgcg 

a TS SLVPAVSS PDFG S R F G ER 

b PALWCQLFHHQTLAPALGSA- 

c QLFGASCFITRLWLPLWGAP-

ctcgcttcacggacaggaagcacagcaggtttatccagatgaactgagaaggtcagatta 

1921 + + + + + + 1980 

gagcgaagtgcctgtccttcgtgtcgtccaaataggtctacttgactcttccagtctaat 

a LASRTGSTAGLSR* TEKVRL 

b SLHGQEAQQVYPDELRRSD* 

c RFTDRKHSRFIQMN * EGQI R- 

gggcggggagaagagcatccggcatgagggctgagatgcgcaaagagtgtgctcgggagt

1981 + + + + + + 2040 

cccgcccctcttctcgtaggccgtactcccgactctacgcgtttctcacacgagccctca 

a GRGEEHPA*GLRCAKSVLGS 

b GGEKSIRHEG*DAQRVCSGV-

c AGRRASGMRAEMRKECAREW- 

ggcccctggcacctgggtgctctggctggagaggaaaagccagttccctacgaggagtgt 

2041 + + + + + + 2100 

ccggggaccgtggacccacgagaccgacctctccttttcggtcaagggatgctcctcaca 

a G PWHLGALAGEEKP V P Y E E C 

b APGTWVLWLERK SQF PTRSV- 

c PLAPGCSGWRGKAS SLRGVF- 

tcccaatgctttgtccatgatgtccttgttattttattgcctttagaaactgagtcctgt 

2101 + + + + + + 2160 

agggttacgaaacaggtactacaggaacaataaaataacggaaatctttgactcaggaca 

a SQCFVHDVLVILLPLETESC 

b PNALSMMSLLFYCL*KLSPV-

c PMLCP*CPCYFIAFRN*VLF- 

tcttgttacggcagtcacactgctgggaagtggcttaatagtaatatcaataaatagatg 

2161 + + + + + + 2220 

agaacaatgccgtcagtgtgacgacccttcaccgaattatcattatagttatttatctac
Figure 11-8

a SCYGSHTAGKWLNSNINK*M

b LVTAVTLLGSG LIVI SINR* 

c LLRQSHCWEVA* * * YQ* I D E - 

agtcctgttagaaaaa 

2221 + 2236 

tcaggacaatcttttt 

a S P V R K 

b V L L E K 

c S C * K 

Enzymes that do cut : 

Not I 

Enzymes that do not cut : 
NONE 
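
Each map is computed against a single enzyme, NotI, and closes by reporting whether it cuts. As an illustration, that check can be sketched in Python as below, assuming the NotI recognition sequence GCGGCCGC (a palindrome, so scanning one strand is enough) and 1-based numbering as on the rulers. The function name is hypothetical, and reporting the start of the recognition site rather than the exact cleavage point is a simplification.

```python
NOTI_SITE = "gcggccgc"  # NotI recognition sequence; its reverse complement is identical


def find_sites(seq, site=NOTI_SITE, offset=1):
    """Return the start positions of every (possibly overlapping) site.

    offset is the sequence position of seq[0], so a single 60-base row of
    a map can be scanned on its own.
    """
    seq = seq.lower()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


# Row 541-600 of the map above, the row beneath the vertical "Not I" label.
row = "ggaccctgaccgctatgtctgtagtggggttcccgggcggccgccaggcctggaggaaga"
print(find_sites(row, offset=541))  # [577]
```

Scanning row by row would miss a site that straddles two rows, so in practice the full sequence would be passed in with the default offset.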
Figure 12-1 Big Tau (exon 4A)

(Linear) MAP of: af047858 check: 5416 from: 1 to: 954

DL;AF047858 - Homo sapiens microtubule-associated protein tau (tau) gene, 
exon 

ID AF047858 standard; DNA; HUM; 954 BP. 
AC AF047858; M93652; 
NI g2898166 

DT 23-FEB-1998 (Rel. 54, Created)

DT 26-FEB-1998 (Rel. 54, Last updated, Version 4) . . . 
With 1 enzymes: NOTI * start exon 4A 

gactgggccgagaagggtccggcctttccgaagcccgccaccactgcgtatctccacaca 
1 + + + + + + 60

ctgacccggctcttcccaggccggaaaggcttcgggcggtggtgacgcatagaggtgtgt 



a *D WAEKGPAFPKPATTAYLHT 

b TGPRRVRPFRSPPPLRI STQ 

c LGREGSGLSEARHHCVS PHR- 

gagcctgaaagtggtaaggtggtccaggaaggcttcctccgagagccaggccccccaggt 
61 + + + + + + 120

ctcggactttcaccattccaccaggtccttccgaaggaggctctcggtccggggggtcca 

a EPESGKVVQEGFLREPGPPG 

b SLKVVRWSRKASSESQAPQV- 

c A* KW* GGPGRLPPRARP PRS- 



ctgagccaccagctcatgtccggcatgcctggggctcccctcctgcctgagggccccaga 

121 + + + + + + 180

gactcggtggtcgagtacaggccgtacggaccccgaggggaggacggactcccggggtct 



a LSHQLMSGMPGAPLLPEG PR 

b *ATSSCPACLGLPSCLRAPE- 
C EPPAHVRHAWGS PPA * G PQR- 

N 
o 
t 
I 

gaggccacacgccaaccttcggggacaggacctgaggacacagagggcggccgccacgcc 
181 + + + + + + 240

ctccggtgtgcggttggaagcccctgtcctggactcctgtgtctcccgccggcggtgcgg
Figure 12-2

a EATRQPSGTGPEDTEGGRHA 

b RPHANLRGQDLRTQRAAATP-

c GHTPTFGDRT* GHRGRPPRP- 

cctgagctgctcaagcaccagcttctaggagacctgcaccaggaggggccgccgctgaag 
241 + + + + + + 300 

ggactcgacgagttcgtggtcgaagatcctctggacgtggtcctccccggcggcgacttc

a PELLKHQLLGDLHQEGPPLK 
b LSCSSTSF* ETCTRRGRR* R 

c *AAQAPASRRPAPGGAAAEG- 

ggggcagggggcaaagagaggccggggagcaaggaggaggtggatgaagaccgcgacgtc 
301 + + + + + + 360

ccccgtcccccgtttctctccggcccctcgttcctcctccacctacttctggcgccgcag 

a GAGGKERPGS KEEVDEDRDV 

b GQGAKRGRGARRRWMKTATS 
C GRGQREAGEQGGGG * RPRRR- 

gatgagtcctccccccaagactcccctccctccaaggcctccccagcccaagatgggcgg 
361 + + + + + + 420 

ctactcaggaggggggttctgaggggagggaggttccggaggggtcgggttctacccgcc 

a DESSPQDSPPSKASPAQDGR 

b MSPPPKTPLPPRPPQPKMGG- 

c *VLPPRLPSLQGLPSPRWAA- 

cctccccagacagccgccagagaagccaccagcatcccaggcttcccagcggagggtgcc 
421 + + + + + + 480 

ggaggggtctgtcggcggtctcttcggtggtcgtagggtccgaagggtcgcctcccacgg

a P PQTAAREATS I PGFPAE GA 

b LPRQPPEKPPASQASQRR VP- 

c SPDSRQRSHQHPRL PSGGCH- 

atccccctccctgtggatttcctctccaaagtttccacagagatcccagcctcagagccc 
481 + + + + + + 540

tagggggagggacacctaaaggagaggtttcaaaggtgtctctagggtcggagtctcggg 

a IPLPVDFLSKVSTEIPASEP 

b SPSLWISSPKFPQRSQPQSP- 

c PPPCGFPLQSFHRDPSLRAR-
Figure 12-3

gacgggcccagtgtagggcgggccaaagggcaggatgcccccctggagttcacgtttcac 
541 + + + + + + 600 

ctgcccgggtcacatcccgcccggtttcccgtcctacggggggacctcaagtgcaaagtg 

a DGPSVGRAKGQDAPLEFTFH

b TGPV*GGPKGRMPPWSSRFT- 

c RAQCRAGQRAGCP PGVHVSR- 

gtggaaatcacacccaacgtgcagaaggagcaggcgcactcggaggagcatttgggaagg 
601 + + + + + + 660

cacctttagtgtgggttgcacgtcttcctcgtccgcgtgagcctcctcgtaaacccttcc 

a VEITPNVQKEQAHSEEHLGR 

b WKSHPTCRRSRRTRRS IWEG- 

c GNHTQRAEGAGALGGAFGKG-

gctgcatttccaggggcccctggagaggggccagaggcccggggcccctctttgggagag 
661 + + + + + + 720 

cgacgtaaaggtccccggggacctctccccggtctccgggccccggggagaaaccctctc 

a AAFPGAPGEGPEARGPSLGE 

b LHFQG PLERGQRPGAPLWER 

c CISRGPWRGARGPGPLFGRG-

gacacaaaagaggctgaccttccagagccctctgaaaagcagcctgctgctgctccgcgg 
721 + + + + + + 780 

ctgtgttttctccgactggaaggtctcgggagacttttcgtcggacgacgacgaggcgcc 

a DTKEADLPE PSEKQPAAAPR 

b TQKRLTFQSPLKSSLLLLRG

c HKRG* PSRAL*KAACCCSAG- 

gggaagcccgtcagccgggtccctcaactcaaaggtctgtgtcttgagcttcttcgctcc 
781 + + + + + + 840 

cccttcgggcagtcggcccagggagttgagtttccagacacagaactcgaagaagcgagg 

a GKPVSRVPQLKGLCLELLRS 

b GSPSAGSLNSKVCVLSFFAP- 

c EARQPGPSTQRSVS * AS SLL- 

ttccctggggacctcccaggcctcccaggctgcgggcactgccactgagcttccaggcct 
841 + + + + + + 900 

aagggacccctggagggtccggagggtccgacgcccgtgacggtgactcgaaggtccgga
Figure 12-4

a FPGDLPGLPGCGHCH * A S R P 

b SLGTSQASQAAGTATELPGL- 

C PWGPPRPPRLRALPLSFQAS- 

cccgactcctgctgcttctgacgttcctaggacgccactaaatcgacacctggg 
901 + + + + + 954 

gggctgaggacgacgaagactgcaaggatcctgcggtgatttagctgtggaccc 

a PDSCCF*RS*DATKSTPG 
b PTPAASDVPRTPLNRHL 
c RLLLLLTFLGRH*IDTW

Enzymes that do cut : 

Not I 

Enzymes that do not cut : 
NONE
Figure 13-1 GFAP

(Linear) MAP of: hsgfap check: 1566 from: 1 to: 3017 

RL;HSGFAP - Human glial fibrillary acidic protein (GFAP) mRNA, complete cds.

ID HSGFAP standard; RNA; HUM; 3017 BP.

AC J04569;
NI g183074

DT 23-APR-1990 (Rel. 23, Created)

DT 16-DEC-1994 (Rel. 42, Last updated, Version 3) . . . 
With 1 enzymes: NOTI 

ccgatggagaggagacgcatcacctccgctgctcgccgctcctacgtctcctcaggggag 
1 + + + + + + 60

ggctacctctcctctgcgtagtggaggcgacgagcggcgaggatgcagaggagtcccctc 

a PMERRRI TSAARRSYVS SGE 

b RWRGDAS PPLLAAP TS PQGR- 

c DGEETHHLRCS PLLRLLRGD- 

atgatggtggggggcctggctcctggccgccgtctgggtcctggcacccgcctctccctg 
61 + + + + + + 120

tactaccaccccccggaccgaggaccggcggcagacccaggaccgtgggcggagagggac 

a MMVGGLAPGRRLGPGTRLSL 

b *WWGAWLLAAVWVLAPASPW- 

c DGGGPGSWPPSGSWHPPLPG- 

gctcgaatgccccctccactcccgacccgggtggatttctccctggctggggcactcaat 
121 + + + + + + 180 

cgagcttacgggggaggtgagggctgggcccacctaaagagggaccgaccccgtgagtta 

a AR MPPPLPTRVDFSLAG A L N 

b LECPLHSRPGWISPWLG HSM- 

c SNAPSTPDPGGFLPGWGTQC- 

gctggcttcaaggagacccgggccagtgagcgggcagagatgatggagctcaatgaccgc 
181 + + + + + + 240 

cgaccgaagttcctctgggcccggtcactcgcccgtctctactacctcgagttactggcg

a AGFKETRASERAEMMELNDR 

b LASRRPGPVSGQR*WSSMTA

c WLQGDPGQ*AGRDDGAQ* PL-
Figure 13-2

tttgccagctacatcgagaaggttcgcttcctggaacagcaaaacaaggcgctggctgct 

241 + + + + + + 300

aaacggtcgatgtagctcttccaagcgaaggaccttgtcgttttgttccgcgaccgacga 

a FAS Y I E KVR F LEQQ N K A L A A 

b LPATSRRFASWNSKTRRWLL 

c CQLHREGSLPGTAKQGAGC*- 

gagctgaaccagctgcgggccaaggagcccaccaagctggcagacgtctaccaggctgag 

301 + + + + + + 360 

ctcgacttggccgacgcccggttcctcgggtggttcgaccgtctgcagatggtccgactc 

a ELNQLRAKE PTKLADVYQAE 

b S*TSCGPRSPPSWQTSTRLS-

c AEPAAGQGAHQAGRRLPG * A - 

ctgcgagagctgcggctgcggctcgatcaactcaccgccaacagcgcccggctggaggtt 

361 + + + + + + 420 

gacgctctcgacgccgacgccgagctagttgagtggcggttgtcgcgggccgacctccaa 

a LRELRLRLDQLTANSARLEV 

b CESCGCGSINSPPTAPGWRL- 

c ARAAAAARS THRQQR PAGG * - 

gagagggacaatctggcacaggacctggccactgtgaggcagaagctccaggatgaaacc 

421 + + + + + + 480 

ctctccctgttagaccgtgtcctggaccggtgacactccgtcttcgaggtcctactttgg 

a ERDNLAQDLATVRQKLQDET

b RGTIWHRTWPL*GRSSRMKP

c EGQSGTGPGHCEAEAPG* NQ- 

aacctgaggccggaagccgagaacaacctggctgcctatagacaggaagcagatgaagcc 

481 + + + + + + 540 

ttggactccgaccttcggctcttgttggaccgacggatatctgtccttcgtctacttcgg 

a NLRLEAENNLAAYRQEADEA 

b T *GWKPRTTWLPIDRKQMKP 

c PEAGSREQPGCL*TGSR*SH- 

accctggcccgcctggatctggagaggaagattgagtcgctggaggaggagatccggttc 

541 + + + + + + 600 

tgggaccgggcagacctagacctctccttctaactcagcgacctcctcctctaggccaag
Figure 13-3

a TLARLDLERKIESLEEEIRF

b PWPVWIWRGRLSRWRRRSGS- 

c PGPSGSGEED* VAGGGD P V L - 

ttgaggaagatccacgaggaggaggttcgggaactccaggagcagctggcccgacagcag 

601 + + + + + + 660

aactccttctaggtgctcctcctccaagcccttgaggtcctcgtcgaccgggctgtcgtc

a LRKIHEEEVRELQEQLARQQ 

b *GRST RRRFGNSRSSWPDSR- 

c EEDPRGGGSGTPGAAGPTAG- 

gtccatgtggagcttgacgtggccaagccagacctcaccgcagccctgaaagagatccgc 

661 + + + + + + 720 

caggtacacctcgaactgcaccggttcggtctggagtggcgtcgggactttctctaggcg 

a VHVELDVAKPDLTAALKEIR 

b SMWSLTWPSQTSPQP*KRSA- 

c PCGA*RGQARPHRS PERDPH- 

acgcagtatgaggcaatggcgtccagcaacatgcatgaagccgaagagtggtaccgctcc 

721 + + + + + + 780 

tgcgtcatactccgttaccgcaggtcgttgtacgtacttcggcttctcaccatggcgagg 

a TQYEAMAS SNMHEAEEWYRS 

b RSMRQWRPATCMKPKSGTAP 

c AV*GNGVQQHA* SRRVVPLQ- 

aagtttgcagacctgacagacgctgctgcccgcaacgcggagctgctccgccaggccaag 

781 + + + + + + 840 

ttcaaacgtctggactgtctgcgacgacgggcgttgcgcctcgacgaggcggtccggttc 

a KFADLTDAAARNAELLRQAK

b S LQT*QTLLPATRSCSA RPS 

c VCRPDRRCC P QRGAAP PGQA- 

cacgaagccaacgactaccggcgccagttgcagtccttgacctgcgacctggagtctctg 

841 + + + + + + 900 

gtgcttcggttgctgatggccgcggtcaacgtcaggaactggacgctggacctcagagac 

a HEANDYRRQLQSLTCDLESL 

b TKPTTTGASCSP*PATWSLC- 

C RSQRLPAPVAVLDLRPGVSA-
Figure 13-4

cgcggcacgaacgagtccctggagaggcagatgcgcgagcaggaggagcggcacgtgcgg 

901 + + + + + + 960 

gcgccgtgcttgctcagggacctctccgtctacgcgctcgtcctcctcgccgtgcacgcc 

a RGTNESLERQMREQEERHVR

b AARTSPWRGRCASRRSGTCG 

C RHERVPGEADA RAGGAARAG- 

gaggcggccagttatcaggaggcgctggcgcggctggaggaagaggggcagagcctcaag 

961 + + + + + + 1020 

ctccgccggtcaatagtcctccgcgaccgcgccgacctccttctccccgtctcggagttc 

a EAASYQEALARLEEEGQSLK 

b RRPVIRRRWRGWRKRGRASR 

C GGQLS GGAGAAGGRGAE PQG- 

gacgagatggcccgccacttgcaggagtaccaggacctgctcaatgtcaagctggccctg 

1021 + + + + + + 1080 

ctgctccaccgggcggtgaacgtcctcatggtcctggacgagttacagttcgaccgggac 

a DEMARHLQEYQDLLNVKLAL 

b TRWPATCRSTRTCSMSSWPW- 

c RDGPPLAGVPGPAQCQAGPG- 

gacatcgagatcgccacctacaggaagctgctagagggcgaggagaaccggatcaccatt

1081 + + + + + + 1140 

ctgtagctctagcggtggatgtccttcgacgatctcccgctcctcttggcctagtggtaa 

a DIEIATYRKLLEGEENRITI 

b TSRSPPTGSC*RARRTGSPF 

c HRDRHLQEAARGRGEPDHHS-

cccgtgcagaccttctccaacctgcagattcgagaaaccagcctggacaccaagtctgtg 

1141 + + + + + + 1200 

gggcacgtctggaagaggttggacgtctaagctctttggtcggacctgtggttcagacac 

a PVQTFSNLQI RETSLDTKSV 

b PCRPSPTCRFEKPAWTPSLC-

c RADLLQPADS RNQPGHQVCV- 

tcagaaggccacctcaagaggaacatcgtggtgaagaccgtggagatgcgggatggagag 

1201 + + + + + + 1260 

agtcttccggtggagttctccttgtagcaccacttctggcacctctacgccctacctctc
Figure 13-5

a S EGHLKRN I VVKTVE MR DGE - 

b QKATSRGTSW*RPWRCGMER 

C RRPPQEEHRGEDRGDAGWRG- 

gtcattaaggagtccaagcaggagcacaaggatgtgatgtgaggcaggacccacctggtg 
1261 + + + + + + 1320 

cagtaattcctcaggttcgtcctcgtgttcctacactacactccgtcctgggtggaccac 

a VIKESKQEHKDVM* GRTHLV 

b SLRSPSRSTRM*CEAGPTWW- 

C H*GVQAGAQGCDVRQD P PGG- 

gcctctgccccgtctcatgaggggcccgagcagaagcaggatagttgctccgcctccgct 
1321 + + + + + + 1380 

cggagacggggcagagtactccccgggctcgtcttcgtcctatcaacgaggcggagacga 

a ASAPSHEGPEQKQDS CSASA 

b PLPRLMRGPSRSRI VAPPLL 

C LCPVS * GARAEAG * LLRLCW- 

ggcacatttccccagacctgagctccccaccaccccagctgctcccctccctcctctgtc 
1381 + + + + + + 1440

ccgtgtaaaggggtctggactcgaggggtggtggggtcgacgaggggagggaggagacag 

a GTFPQT*APHHPSCSPPSSV 

b AHFPRPELPTTPAAPLPPLS 

C HISPDLSSPPPQLLPSLLCP- 

cctaggtcagcttgctgccctaggctccgtcagtatcaggcctgccagacggcacccacc
1441 + + + + + + 1500 

ggatccagtcgaacgacgggatccgaggcagtcatagtccggacggtctgccgtgggtgg 

a PRSACCPRLRQYQACQT A PT 

b LGQLAALGSVS IRPARR HP P 

c *VSLLP*APSVSGLPDGTHP-

cagcacccagcaactccaactaacaagaaactcacccccaagggcagtctggaggggcat 
1501 + + + + + + 1560

gtcgtgggtcgttgaggttgattgttctttgagtgggggttcccgtcagacctccccgta 

a QHPATPTNKKLTPKGSLEGH 

b STQQLQLTRNSPPRAVWRGM 

c APSNSN*QETHPQGQSGGAW-
Figure 13-6

ggccagcagcttgcgttagaatgaggaggaaggagagaaggggaggagggcggggggcac 

1561 + + + + + + 1620 

ccggtcgtcgaacgcaatcttactcctccttcctctcttcccctcctcccgccccccgtg 

a GQQLALE*GGRREGEEGGGH 

b ASSLR*NEEEGEKGRRAGGT 

c PAACVRMRR KERRGGGR GA P - 

ctactacatcgccctccacatccctgattcctgttgttatggaaactgttgccagagatg 

1621 + + + + + + 1680 

gatgatgtagcgggaggtgtagggactaaggacaacaatacctttgacaacggtctctac 

a LLHRPPHP*FLLLWKLLPEM

b YYIALHIPDSCCYGNCCQRW

c TTSPSTSLIPVVMETVARDG-

gaggttctctcggagtatctgggaactgtgcctttgagtttcctcaggctgctggaggaa 

1681 + + + + + + 1740 

ctccaagagagcctcatagacccttgacacggaaactcaaaggagtccgacgacctcctt 

a EVLSEYLGTVPLSFLRLLEE

b RFSRSIWELCL*VSSGCWRK- 

c GSLGVSGNCAFEFPQAAGGK- 

aactgagactcagacaggaaagggaaggccccacagacaaggtagccctggccagaggct 

1741 + + + + + + 1800 

ttgactctgagtctgtcctttcccttccggggtgtctgttccatcgggaccggtctccga 

a N*DSDRKGKAPQTR*PWPEA

b TETQTGKGRPHRQGSPGQRL

c LRLRQEREGPTDKVALARGL-

tgttttgtcttttggtttttatgaggtgggatatccctatgctgcctaggctgaccttga 

1801 + + + + + + 1860

acaaaacagaaaaccaaaaatactccaccctatagggatacgacggatccgactggaact 

a CFVFWFL*GGISLCCLG*P* 

b VLSFG FYEVGYPYAA * ADLE 

c FCLLVFMRWDIPMLPRLTLN- 

actcctgggctcaagcagtctacccacctcagcctcctgtgtagctgggattatagattg 
1861 + + + + + + 1920 

tgaggacccgagttcgtcagatgggtggagtcggaggacacatcgaccctaatatctaac
Figure 13-7

a TPGLKQSTHLSLLCSWDYRL 

b LLGSSSLPTSASCVAGIIDW-

c SWAQAVYPPQPPV*LGL*IG-

gagccaccatgcccagctcagagggttgttctcctagactgaccctgatcagtctaagat 
1921 + + + + + + 1980

ctcggtggtacgggtcgagtctcccaacaagaggatctgactgggactagtcagattcta 

a EPPCPAQRVVLLD*P*SV*D 

b SHHAQLRGLFS*TDPDQSKM- 

c ATMPSSEGCSPRLTLISLRW- 

gggtggggacgtcctgccacctggggcagtcacctgcccagatcccagaaggacctcctg 
1981 + + + + + + 2040 

cccacccctgcaggacggtggaccccgtcagtggacgggtctagggtcttcctggaggac 

a GWGRPATWGSHLPRSQKDLL 

b GGDVLPPGAVTCPDPRRTS*

C VGTSCHLGQSPAQI PEGPPE- 

agcgatgactcaagtgtctcagtccacctgagctgccatccagggatgccatctgtgggc 
2041 + + + + + + 2100 

tcgctactgagttcacagagtcaggtggactcgacggtaggtccctacggtagacacccg 

a SDDSSVSVHLSCHPGMPSVG 

b AMTQVSQST*AAI QGCHLWA- 

C R*LKCLSPPELPSRDAICGH- 

acgctgtgggcaggtgggagcttgattctcagcacttgggggatctgttgtgtacgtgga 
2101 + + + + + + 2160 

tgcgacacccgtccaccctcgaactaagagtcgtgaaccccctagacaacacatgcacct 

a TLWAGGSLILSTWGICCVRG

b RCGQVGA*FSALGGSVVYVE

c AVGRWELDSQHLGDLLCTWR-

gagggatgaggtgctgggagggatagaggggggctgcctggcccccagctgtgggtacag 
2161 + + + + + + 2220 

ctccctactccacgaccctccctatctccccccgacggaccgggggtcgacacccatgtc 

a EG*GAGRDRGGLPGPQLWVQ 

b RDEVLGGIEGGCLAPSCGYR- 

c GMRCWEG* RGAAWP PAVGTE-
Figure 13-8

agaggtcaagcccaggaggactgccccgtgcagactggaggggacgctggtagagatgga 

2221 + + + + + + 2280 

tctccagttcgggtcctcctgacggggcacgtctgacctcccctgcgaccatctctacct 

a RGQAQEDCPVQTGGDAGRDG 

b EVKPRRTAPCRLEGTLVEME 

c RSSPGGLPRADWR GRW* RWR- 



ggaggaggcaattgggatggcactaggcatacaagtaggggttgtgggtgaccagttgca 

2281 + + + + + + 2340

cctcctccgttaaccctaccgtgatccgtatgttcatccccaacacccactggtcaacgt



a GGGNWDGTRHTSRGCG*PVA

b EEAI GMALGIQVGVVGDQLH 

C RRQLGWH *AYK*GLWVTSCT- 



cttggcctctggattgtgggaattaaggaagtgactcatcctcttgaagatgctgaaaca 

2341 + + + + + + 2400

gaaccggagacctaacacccttaattccttcactgagtaggagaacttctacgactttgt 



a LGLWIVGI KEVTHPLEDAET 

b LASGLWELRK* LILLKMLKQ 

C WFLDCGN*GSDSSS*RC*NR 



ggagagaaaggggatgtatccatgggggcagggcatgactttgtcccatttctaaaggcc 
2401 + + + + + + 2460

cctctctttcccctacataggtacccccgtcccgtactgaaacagggtaaagatttccgg



a GEKGDVSMGAGHDFVPFLKA

b ERKGMYPWGQGMTLSHF*RP

c RERGCIHGGRA*LCPISKGL-



tcttccttgctgtgtcataccaggccgccccagcctctgagcccctgggactgctgcttc 
2461 + + + + + + 2520

agaaggaacgacacagtatggtccggcggggtcggagactcggggaccctgacgacgaag 



a SSLLCHTRPPQPLSPWDCCF 

b LPCCVIPGRPSL*APGTAAS- 

C FLAVSYQAAPASEPLGLLLL- 



ttaaccccagtaagccactgccacacgtctgaccctctccaccccatagtgaccggctgc 
2521 + + + + + + 2580

aattggggtcattcggtgacggtgtgcagactgggagaggtggggtatcactggccgacg 



Figure 13-9 

a LTPVSHCHTSDPLHPIVTGC 

b *PQ*ATATRLTLSTP* * P A A - 

c NPSKPLPHV*PSPPHSDRLL- 

ttttccctaagccaagggcctcttgcggtcccttcttactcacacacaaaatgtacccag 

2581 + + + + + + 2640 

aaaagggattcggttcccggagaacgccagggaagaatgagtgtgtgttttacatgggtc 

a FSLSQGPLAVPSYSHTKCTQ 

b FP*AKGLLRSLLTHTQNVPS

c FPKPRASCGPFLLTHKMYPV- 

tattctaggtagtgccctattttacaattgtaaaactgaggcacgagcaaagtgaagaca 

2641 + + + + + + 2700 

ataagatccaccacgggataaaatgttaacattttgactccgtgctcgtttcacttctgt 

a YSR*CPILQL*N*GTSKVKT

b ILGSALFYNCKTEARAK* R H - 

c F*VVPYFTIVKLRHEQS E D T - 

ctggctcatattcctgcagcctggaggccgggtgctcagggctgacacgtccaccccagt 

2701 + + + + + + 2760 

gaccgagtataaggacgtcggacctccggcccacgagtcccgactgtgcaggtggggtca 

a LA H I PAAWRPGAQG * HVH PS 

b WLIFLQPGGRVLRADTSTPV- 

c GSYSCSLEAGCSGLTRP PQC- 

gcacccactctgctttgactgagcagactggtgagcagactggtgggatctgtgcccaga 

2761 + + + + + + 2820 

cgtgggtgagacgaaactgactcgtctgaccactcgtctgaccaccctagacacgggtct 

a APTLL*LSRLVSRLVGSVPR 

b HPLCFD*ADW*ADWWDLCPE- 

c THSAL TEQTGEQTGGI CAQR- 

gatgggactgggagggcccacttcagggttctcctctcccctctaaggccgaagaagggt 

2821 + + + + + + 2880

ctaccctgaccctcccgggtgaagtcccaagaggagaggggagattccggcttcttccca 

a DGTGRA HFRVLLSPLRPKKG - 

b MGLGGPTSGFSSPL*GRRRV- 
c WDWEGPLQGSPLPS KAEEGS-
Figure 13-10

ccttccctctccccaagacttggtgtcctttccctccacttcttcctgccacctgctgct 

2881 + + + + + + 2940 

ggaagggagaggggttctgaaccacaggaaagggaggtgaagaaggacggtggacgacga 

a PSLSPRLGVLSLHFFLPPAA

b LPSPQDLVSFPSTSSCHLLL- 

C FPLPKTWCPFPPLLPATCCC- 

gctgctgctgctaatcttcagggcactgctgctgcctttagtcgctgaggaaaaataaag 

2941 + + + + + + 3000 

cgacgacgacgattagaagtcccgtgacgacgacggaaatcagcgactcctttttatttc 

a AAAANLQGTAAAFSR * GKI K 

b LLLLIFRALLLPLVAEEK*R-

c CCC*SSGHCCCL*SLRKNKD- 

acaaatgctgcgccctt 

3001 + 3017 

tgtttacgacgcgggaa 

a T N A A P 

b Q M L R P 

c K C C A L - 

Enzymes that do cut : 

NONE 

Enzymes that do not cut : 
Not I
Figure 14-1 P53

(Linear) MAP of: hsp53t check: 64 0 from: 1 to: 1760 

RL;HSP53T - Human p53 cellular tumor antigen mRNA, complete cds.

ID HSP53T standard; RNA; HUM; 1760 BP.

AC K03199;
NI g189478

DT 18-NOV-1986 (Rel. 10, Created)

DT 18-JAN-1995 (Rel. 42, Last updated, Version 4) . . .
With 1 enzymes: NOTI

gtcgaccctttccacccctggaagatggaaataaacctgcgtgtgggtggagtgttagga 
1 + + + + + + 60

cagctgggaaaggtggggaccttctacctttatttggacgcacacccacctcacaatcct 

a VDPFHPWKMEINLRVGGVLG 

b STLSTPGRWK*TCVWVEC*D- 

c RPFPPLEDGNKPACGWSVRT-

caaaaaaaaaaaaaaaaaagtctagagccaccgtccagggagcaggtagctgctgggctc 
61 + + + + + + 120

gttttttttttttttttttcagatctcggtggcaggtccctcgtccatcgacgacccgag 

a QKKKKKSLEPPSREQVAAGL 

b KKKKKKV*SHRPGSR*LLGS- 

c KKKKKKSRATVQGAGSCWAP-

cggggacactttgcgttcgggctgggagcgtgctttccacgacggtgacacgcttccctg 

121 + + + + + + 180

gcccctgtgaaacgcaagcccgaccctcgcacgaaaggtgctgccactgtgcgaagggac 

a RGHFAFGLGACFPRR * HAS L 

b GDTLRSGWERAFHDGDTLPW-

c GTLCVRAGSVLS T TVTR F PG- 

gattggcagccagactgccttccgggtcactgccatggaggagccgcagtcagatcctag 
181 + + + + + + 240 

ctaaccgtcggtctgacggaaggcccagtgacggtacctcctcggcgtcagtctaggatc 

a DWQPDCLP GHCHGGAAVRS * 

b IGSQTAFRVTAMEEPQSDPS 
C LAARLPSGSLPWRSRSQ I LA-
Figure 14-2

cgtcgagccccctctgagtcaggaaacattttcagacctatggaaactacttcctgaaaa 

241 + + + + + + 300 

gcagctcgggggagactcagtcctttgtaaaagtctggacacctttgatgaaggactttt 

a RRAPSESGNI FRPMETTS * K 

b VEPPLSQETFSDLWK LLPEN- 

c SSPL*VRKHFQTYGNY FLKT- 

caacgttctgtcccccttgccgtcccaagcaatggatgatttgatgctgtccccggacga 

301 + + + + + + 360 

gttgcaagacagggggaacggcagggttcgttacctactaaactacgacaggggcctgct 

a QRSVPLAVPSNG* FDAVPGR 

b NVLSPLPSQAMDDLMLSPDD-

c TFCPPCRPKQWMI*CCPRTI- 

tattgaacaatggttcactgaagacccaggtccagatgaagctcccagaatgccagaggc 
361 + + + + + + 420

ataacttgttaccaagtgacttctgggtccaggtctacttcgagggtcttacggtctccg 

a Y*TMVH*RPRSR*SSQNARG 

b IEQWFTEDPGPDEAPRMPEA- 

c LNNGSLKTQVQMKLPECQRL-

tgctccccccgtggcccctgcaccagcagctcctacaccggcggcccctgcaccagcccc 

421 + + + + + + 480 

acgaggggggcaccggggacgtggtcgtcgaggatgtggccgccggggacgtggtcgggg 

a CSPRGPCTSSSYTGGPCTSP 

b APPVAPAPAAPTPAAPAPAP

C LPPWPLHQQLLHRRPLHQPP- 

ctcctggcccctgtcatcttctgtcccttcccagaaaacctaccagggcagctacggttt 
481 + + + + + + 54 0 

gaggaccggggacagtagaagacagggaagggtcttttggatggtcccgtcgatgccaaa 

a LLAPVI FCPFPENLPGQLRF 

b SWPLSSSVPSQKTYQGSYGF- 

c PGPCHLLSLPRKPTRAATVS- 

ccgtctgggcttcttgcattctgggacagccaagtctgtgacttgcacgtactcccctgc 
541 + + + + + + 600

ggcagacccgaagaacgtaagaccctgtcggttcagacactgaacgtgcatgaggggacg
Figure 14-3

a PSGLLAFWDSQVCDLHVLPC

b RLGFLHSGTAKSVTCTYSPA- 

C VWASCILGQPSL*LARTPLP- 

cctcaacaagatgttttgccaactggccaagacctgccctgtgcagctgtgggttgattc 

601 + + + + + + 660 

ggagttgttctacaaaacggttgaccggttctggacgggacacgtcgacacccaactaag 

a PQQDVLPTGQDLPCAAVG * F 

b LNKMFCQLAKTCPVQLWVDS- 

c STRCFANWPRPALCSCGLIP- 

cacacccccgcccggcacccgcgtccgcgccatggccatctacaagcagtcacagcacat 
661 + + + + + + 720 

gtgtgggggcgggccgtgggcgcaggcgcggtaccggtagatgttcgtcagtgtcgtgta 

a HTPARHPRPRHGHLQAVTAH 

b TPPPGTRVRAMAIYKQSQHM 

c HPRPAPASAPWPSTSSHST*- 

gacggaggttgtgaggcgctgcccccaccatgagcgctgctcagatagcgatggtctggc 

721 + + + + + + 780 

ctgcctccaacactccgcgacgggggtggtactcgcgacgagtctatcgctaccagaccg 

a DGGCEALPPP*ALLR*RWSG

b TEVV R RC PH HE R C S D S D G LA 

C RRL* GAAPTMSAAQ IAMVW P - 

ccctcctcagcatcttatccgagtggaaggaaatttgcgtgtggagtatttggatgacag 

781 + + + + + + 840 

gggaggagtcgtagaataggctcaccttcctttaaacgcacacctcataaacctactgtc 

a PSSASYPSGRKFACGV F G * Q 

b PPQ HLIRVEGNLRVEYLDDR- 

c LLSILSEWKEICVWSIWMTE- 

aaacacttttcgacatagtgtggtggtgccctatgagccgcctgaggttggctctgactg 

841 + + + + + + 900 

tttgtgaaaagctgtatcacaccaccacgggatactcggcggactccaaccgagactgac 

a KHFST*CGGAL*AA*GWL*L

b NTFRHSVVVPYEPPEVGSDC- 

c TLFDI VWWC PMSRLRLALTV-
Figure 14-4

taccaccatccactacaactacatgtgtaacagttcctgcatgggcggcatgaaccggag 

901 + + + + + + 960 

atggtggtaggtgatgttgatgtacacattgtcaaggacgtacccgccgtacttggcctc 

a YHHPLQLHV*QFLHGRHEPE 

b TTIHYNYMCNSSCMGGMNRR- 

C PPSTTTTCVTVPAWAA * TGG- 

gcccatcctcaccatcatcacactggaagactccagtggtaatctactgggacggaacag

961 + + + + + + 1020 

cgggtaggagtggtagtagtgtgaccttctgaggtcaccattagatgaccctgccttgtc 

a AHPHHHHTGRLQW* S TGTEQ 

b PILTI ITLEDSSGNLLGRN S 

c PSSPSSHWKTPVVIYWDGTA- 

ctttgaggtgcatgtttgtgcctgtcctgggagagaccggcgcacagaggaagagaatct 

1021 + + + + + + 1080 

gaaactccacgtacaaacacggacaggaccctctctggccgcgtgtctccttctcttaga 

a L*GACLCLSWERPAHRGRES
b FEVHVCACPGRDRRTEEENL 
C LRCM FVPVLGETGAQRKRIS- 

ccgcaagaaaggggagcctcaccacgagctgcccccagggagcactaagcgagcactgcc 

1081 + + + + + + 1140

ggcgttctttcccctcggagtggtgctcgacgggggtccctcgtgattcgctcgtgacgg 

a PQERGASPRAAPREH * ASTA 

b RKKGEPHHELPPGSTKRALP

c ARKGSLTTSCPQGALS EHC P- 

caacaacaccagctcctctccccagccaaagaagaaaccactggatggagaatatttcac 

1141 + + + + + + 1200 

gttgttgtggtcgaggagaggggtcggtttcttctttggtgacctacctcttataaagtg 

a Q QHQLLSPAKEETTGWRIFH 

b NNTSSSPQPKKKPLDGEYFT 
C TTPAPLPSQRRNHWMENISP- 

ccttcagatccgtgggcgtgagcgcttcgagatgttccgagagctgaatgaggccttgga 

1201 + + + + + + 1260 

ggaagtctaggcacccgcactcgcgaagctctacaaggctctcgacttactccggaacct 
Figure 14-5

a PSDPWA*ALRDVPRAE* G L G 

b LQIRGRERFEMFRELNEALE

c FRSVGVSASRCSES*MRPWN- 

actcaaggatgcccaggctgggaaggagccaggggggagcagggctcactccagccacct 
1261 + + + + + + 1320

tgagttcctacgggtccgacccttcctcggtcccccctcgtcccgagtgaggtcggtgga 

a TQGCPGWEGARGEQGSLQPP 

b LKDAQAGKEPGGSRAHS SHL 

c SRMPRLGRSQGGAGLTPAT * - 

gaagtccaaaaagggtcagtctacctcccgccataaaaaactcatgttcaagacagaagg 

1321 + + + + + + 1380 

cttcaggtttttcccagtcagatggagggcggtattttttgagtacaagttctgtcttcc 

a EVQKGSVYLPP* KTHVQDRR 

b KSKKGQSTSRHKKLMFKTEG- 

c SPKRVSLPPAIKNSCSRQKG- 

gcctgactcagactgacattctccacttcttgttccccactgacagcctcccacccccat 

1381 + + + + + + 1440 

cggactgagtctgactgtaagaggtgaagaacaaggggtgactgtcggagggtgggggta 

a A*LRLTFSTSCSPLTASHPH 

b PDSD*HSPLLVPH*QPPTPI- 

c LTQTDILHFLFPTDSLP PPS- 

ctctccctcccctgccattttgggttttgggtctttgaacccttgcttgcaataggtgtg 
1441 + + + + + + 1500 

gagagggaggggacggtaaaacccaaaacccagaaacttgggaacgaacgttatccacac 

a LS LPCHFGFWVFEPLLAIGV 

b SPSPAILGFGSLNPCLQ*VC- 

c LPPLPFWVLGL * TLACNRCA- 

cgtcagaagcacccaggacttccatttgctttgtcccggggctccactgaacaagttggc 
1501 + + + + + + 1560 

gcagtcttcgtgggtcctgaaggtaaacgaaacagggccccgaggtgacttgttcaaccg 

a RQKHPGLPFALSRGSTEQVG 

b VRSTQDFHLLCPGAPLNKLA- 

c SEAPRTSICFVPGLH*TSWP- 
Figure 14-6

ctgcactggtgttttgttgtggggaggaggatggggagtaggacataccagcttagattt 
1561 + + + + + + 1620

gacgtgaccacaaaacaacacccctcctcctacccctcatcctgtatggtcgaatctaaa 



a LHWCFVVGRR MGSRTYQLRF 

b CTGVLLWGGGWGVGHTSLDF

C ALVFCCGEEDGE*DIPA*IL- 

taaggtttttactgtgagggatgtttgggagatgtaagaaatgttcttgcagttaagggt 
1621 + + + + + + 1680 

attccaaaaatgacactccctacaaaccctctacattctttacaagaacgtcaattccca 

a *GFYCEGCLGDVRNVLAVKG

b KVFTVRDVWEM* EMFLQLR V- 

c RFLL*GMFGRCKKCSCS * G L - 

tagtttacaatcagccacattctaggtagggacccacttcaccgtactaaccagggaagc 
1681 + + + + + + 1740 

atcaaatgttagtcggtgtaagatccatccctgggtgaagtggcatgattggtcccttcg 

a *FTI SHILGRDPLHRTNQGS 

b SLQSATF*VGTHFTVLTREA- 

c VYNQPHSR*GPTSPY* PGKL- 

tgtccctcactgttgaattc 
1741 + + 1760 

acagggagtgacaacttaag 

a CPSLLN 

b VPHC*I

c SLTVEF- 

Enzymes that do cut: 

NONE 

Enzymes that do not cut:
Not I 
Figure 15-1 BCL2

(Linear) MAP of: hsbcl2a check: 1433 from: 1 to: 5086 

RL ; HSBCL2A - Human B-cell leukemia/lymphoma 2 (bcl-2) proto-oncogene mRNA
ID HSBCL2A standard; RNA; HUM; 5086 BP. 
AC M13994;
NI g179366

DT 19-SEP-1987 (Rel. 13, Created)

DT 16-DEC-1994 (Rel. 42, Last updated, Version 3) . . . 
With 1 enzymes: NOTI 



gcgcccgcccctccgcgccgcctgcccgcccgcccgccgcgctcccgcccgccgctctcc 
1 + + + + + + 60

cgcgggcggggaggcgcggcggacgggcgggcgggcggcgcgagggcgggcggcgagagg 

a APAPPRRLPARPPRSRPPLS 

b RPPLRAACPPARRAPARRSP 

c ARPSAPPARPPAALPPAALR- 

gtggccccgccgcgctgccgccgccgccgctgccagcgaaggtgccggggctccgggccc 
61 + + + + + + 120 

caccggggcggcgcgacggcggcggcggcgacggtcgcttccacggccccgaggcccggg 

a VAPPRCRRRRCQ RRCRGSGP 

b W P RRAAAAAAAS E GA G A P G P 

c GPAALP P PPLPAKVP GLRAL- 

tccctgccggcggccgtcagcgctcggagcgaactgcgcgacgggaggtccgggaggcga 

121 + + + + + + 180

agggacggccgccggcagtcgcgagcctcgcttgacgcgctgccctccaggccctccgct 

a S L PAAVSARSELRDGRS G RR 

b PCRRPSALGANCATGGPGGD- 

c PAGGRQRS ERTARREVREAT- 

ccgtagtcgcgccgccgcgcaggaccaggaggaggagaaagggtgcgcagcccggaggcg 

181 + + + + + + 240

ggcatcagcgcggcggcgcgtcctggtcctcctcctctttcccacgcgtcgggcctccgc 

a P*SRRRAGPGGGERVRS PEA 

b RSRAAAQDQEEEKGCAARRR 

c VVAPPRRTRRRRKGAQ PGGG- 
Figure 15-2

gggtgcgccggtggggtgcagcggaagagggggtccaggggggagaacttcgtagcagtc 

241 + + + + + + 300 

cccacgcggccaccccacgtcgccttctcccccaggtcccccctcttgaagcatcgtcag 

a GCAGGVQRKRGSRG ENFVAV 

b GAPVGCSGRGGPGGRTS * Q S 

c VRRWGAAEEGVQGGELRSSH- 

atcctttttaggaaaagagggaaaaaataaaaccctcccccaccacctccttctccccac 

301 + + + + + + 360

taggaaaaacccttttctcccttttttattttgggagggggtggtggaggaagaggggtg 

a IL FR KRGKK*NPPPPPPSPH 

b SFLGKEGKNKTLPHHLLLP T- 

c PF*EKREKIKPSPTTSFSPP- 

ccctcgccgcaccacacacagcgcgggcttctagcgctcggcaccggcgggccaggcgcg 

361 + + + + + + 420 

gggagcggcgtggtgtgtgtcgcgcccgaagatcgcgagccgtggccgcccggtccgcgc

a PSPHHTQRGLLALGTGGPGA 

b PRRTTHSAG F*RS APAGQAR 

C LAAPHTARAS SARHRRARRV- 

tcctgccttcatttatccagcagcttttcggaaaatgcatttgctgttcggagtttaatc 

421 + + + + + + 480 

aggacggaagtaaataggtcgtcgaaaagccttttacgtaaacgacaagcctcaaattag 

a SCLHLSSSFSENAFAVRSLI

b PAFIYPAAFRKMHLLFGV*S

c LPSFIQQLFGKCICCSEFNQ-

agaagacgattcctgcctccgtccccggctccttcatcgtcccatctcccctgtctctct 

481 + + + + + + 540 

tcttctgctaaggacggaggcaggggccgaggaagtagcagggtagaggggacagagaga 

a RRRFLPPSPAPSSSH LPCLS 

b EDDSCLRPRLLHRPI SPVSL- 

c KTIPASVPGSFIVPSPLSLS- 

cctggggaggcgtgaagcggtcccgtggatagagattcatgcctgtgtccgcgcgtgtgt 

541 + + + + + + 600 

ggacccctccgcacttcgccagggcacctatctctaagtacggacacaggcgcgcacaca 
Figure 15-3

a PG EA*SGP VDRDSCLCPRVC 

b LGRREAVPWIEIHACVRACV-

c WGGVKRSRG *RFMPVSARVC- 

gcgcgcgtataaattgccgagaaggggaaaacatcacaggacttctgcgaataccggact 
601 + + + + + + 660

cgcgcgcatatttaacggctcttccccttttgtagtgtcctgaagacgcttatggcctga 

a ARV* IAEKGKTSQDFCEYRT 

b RAYKLPRRGKHHRTSANTGL 

c ARINCREGENITGLLRI P D * - 

gaaaattgtaattcatctgccgccgccgctgccaaaaaaaaactcgagctcttgagatct 
661 + + + + + + 720 

cttttaacattaagtagacggcggcggcgacggtttttttttgagctcgagaactctaga 

a ENCNSSAAAAAKKKLELLRS 

b KIVIHLPPPLPKKNSSS*DL- 

c K L * FICRRRCQKKTRALEI S - 

ccggttgggattcctgcggattgacatttctgtgaagcagaagtctgggaatcgatctgg 

721 + + + + + + 780 

ggccaaccctaaggacgcctaactgtaaagacacttcgtcttcagacccttagctagacc 

a PVGIPAD*HFCEAEVWESIW 

b RLGFLRIDISVKQKSGNRSG-

c GWDSCGLTFL*SRSLGIDLE-

aaatcctcctaatttttactccctctccccccgactcctgattcattgggaagtttcaaa 
781 + + + + + + 840

tttaggaggattaaaaatgagggagaggggggctgaggactaagtaacccttcaaagttt 

a KSS*FLLPLPPTPDSLGS FK 

b NPPNFYSLS PRLLIHWEVSN- 

c ILLIFTPSPPDS*FIGKFQI- 

tcagctataactggagagtgctgaagattgatgggatcgttgccttatgcatttgttttg 
841 + + + + + + 900 

agtcgatattgacctctcacgacttctaactaccctagcaacggaatacgtaaacaaaac 

a SAITGEC * RLMGSLPYAFVL 

b QL*LESAED*WDRCLMHLFW-

c SYNWRVLKI DGIVALC I CFG- 
Figure 15-4

gttttacaaaaaggaaacttgacagaggatcatgctgtacttaaaaaatacaagtaagtc 
901 + + + + + + 960

caaaatgtttttcctttgaactgtctcctagtacgacatgaattttttatgttcattcag 



a VLQKGNLTEDHAVLKKYK* V 

b FYKKET*QRIMLYLKNTSKS 

c FTKRKLDRGSCCT* KIQVSL- 



tcgcacaggaaattggtttaatgtaactttcaatggaaacctttgagattttttacttaa 

961 + + + + + + 1020

agcgtctcctttaaccaaattacattgaaagttacctttggaaactctaaaaaatgaatt 



a SHRKLV* CNFQWKPLRFFT * 

b RTGNWFNVTFNGNL* DFLLK 

c AQEIGLM*LSMETFEIFYLK- 

agtgcattcgagtaaatttaatttccaggcagcttaatacattgtttttagccgtgttac
1021 + + + + + + 1080 

tcacgtaagctcatttaaattaaaggtccgtcgaattatgtaacaaaaatcggcacaatg

a SAFE*I*FPGSLIHCF*PCY

b VHSSKFNFQAA*YI VFSRVT 

c CIRVNLISRQLNTLFLAVLL- 



ttgtagtgtgtatgccctgctttcactcagtgtgtacagggaaacgcacctgatttttta

1081 + + + + + + 1140

aacatcacacatacgggacgaaagtgagtcacacatgtccctttgcgtggactaaaaaat 



a L*CVCPAFTQCVQGNAPDFL

b CSVYALLSLSVYRETHLIFY

C VVCMPCFHSVCTGKR T* FFT- 

cttattagtttgttttttctttaacctttcagcatcacagaggaagtagactgatattaa 

1141 + + + + + + 1200 

gaataatcaaacaaaaaagaaattggaaagtcgtagtgtctccttcatctgactataatt 

a LISLFFL*PFSITEEVD*Y*

b LLVCFFFNLSASQRK*TDIN 

C Y * FVFSLTFQHHRGS RL I LT- 

caatacttactaataataacgtgcctcatgaaataaagatccgaaaggaattggaataaa 
1201 + + + + + + 1260

gttatgaatgattattattgcacggagtactttatttctaggctttccttaaccttattt 
Figure 15-5

a QYLLIITCLMK*RSERNWNK

b NTY***RAS*NKDPKGIGIK-

c ILTNNNVPHEIKIRKELE*K-

aatttcctgcgtctcatgccaagagggaaacaccagaatcaagtgttccgcgtgattgaa 

1261 + + + + + + 1320 

ttaaaggacgcagagtacggttctccctttgtggtcttagttcacaaggcgcactaactt 

a NFLRLMPRGKHQNQVFRVI E 

b ISCVSCQEGNTRIKCSA*LK- 

c FPASHAKRETPES SVP RD * R - 

gacaccccctcgtccaagaatgcaaagcacatccaataaaatagctggattataactcct 

1321 + + + + + + 1380

ctgtgggggagcaggttcttacgtttcgtgtaggttattttatcgacccaatattgagga 

a DTPSSKNAKHIQ*NSWIITP 

b TPPRPRMQSTSNKIAGL*LL-

c HPLVQECKAHPIK*LDYNSS-

cttctttctctgggggccgtggggtgggagctggggcgagaggtgccgttggcccccgtt 

1381 + + + + + + 1440 

gaagaaagagacccccggcaccccaccctcgaccccgctctccacggcaaccgggggcaa 

a LLSLGAVGWELGREVPLAPV

b FFLWGPWGGSWGERCRWPPL- 

C S FSGGRGVGAGARGAVG PRC- 

gcttttcctctgggaaggatggcgcacgctgggagaacggggtacgacaaccgggagata 

1441 + + + + + + 1500 

cgaaaaggagacccttcctaccgcgtgcgaccctcttgccccatgctgttggccctctat 

a AFPLGRMAHAGRTGYDN REI 

b LFLWEGWRTLGERGTT T GR * - 

c FSSGKDGARWENGVRQP GDS- 

gtgatgaagtacatccattataagctgtcgcagaggggctacgagtgggatgcgggagat 

1501 + + + + + + 1560

cactacttcatgtaggtaatattcgacagcgtctccccgatgctcaccctacgccctcta 

a VMKYIHYKLSQRGYEWDAGD 

b * * S T S I I SCRRGATSGMREM- 

c DEVHPL* AVAEGLRVGCGRC- 
Figure 15-6

gtgggcgccgcgcccccgggggccgcccccgcaccgggcatcttctcctcccagcccggg 

1561 + + + + + + 1620 

cacccgcggcgcgggggcccccggcgggggcgtggcccgtagaagaggagggtcgggccc 

a VGAAPPGAAPAPGI FSSQPG 

b WAPRPRGPPPHRASSPPSPG- 

c GRRAPGGRPRTGHLLLPARA- 

cacacgccccatccagccgcatcccgcgacccggtcgccaggacctcgccgctgcagacc 

1621 + + + + + + 1680 

gtgtgcggggtaggtcggcgtagggcgctgggccagcggtcctggagcggcgacgtctgg 

a HTPHPAASRDPVARTS PLQT 

b TRPIQPHPATRSPGPRRCRP 

c HAPS SRI PRPGRQDLAAAD P - 

ccggctgcccccggcgccgccgcggggcctgcgctcagcccggtgccacctgtggtccac 

1681 + + + + + + 1740 

ggccgacgggggccgcggcggcgccccggacgcgagtcgggccacggtggacaccaggtg

a PAAPGAAAGPALSPVPPVVH 

b RLPPAPPRGLRSARCHLWST- 

c GCPRRRRGACAQPGATCG P P - 

ctggccctccgccaagccggcgacgacttctcccgccgctaccgcggcgacttcgccgag 

1741 + + + + + + 1800 

gaccgggaggcggttcggccgctgctgaagagggcggcgatggcgccgctgaagcggctc 

a LALRQAGDDFSRRYRGDFAE 

b WPSAKPATTSPAATAATS PR 

c GPPP SRRRLLPPLPRRLRRD- 

atgtccagccagctgcacctgacgcccttcaccgcgcggggacgctttgccacggtggtg 

1801 + + + + + + 1860 

tacaggtcggtcgacgtggactgcgggaagtggcgcgcccctgcgaaacggtgccaccac 

a MSSQLHLTPFTARGRFATVV 

b CPASCT*RPSPRGDALPRWW- 

c VQPAAPDALHRAGTLCHGGG- 

gaggagctcttcagggacggggtgaactgggggaggattgtggccttctttgagttcggt 

1861 + + + + + + 1920 

ctcctcgagaagtccctgccccacttgaccccctcctaacaccggaagaaactcaagcca 
Figure 15-7

a EELFRDGVNWGRIVAFFEFG 

b RSSSGTG*TGGGLWPSLSSV- 

c GALQGRGELGEDCGLL*VRW-

ggggtcatgtgtgtggagagcgtcaaccgggagatgtcgcccctggtggacaacatcgcc 

1921 + + + + + + 1980 

ccccagtacacacacctctcgcagttggccctctacagcggggaccacctgttgtagcgg 

a GVMCVESVNREMSPLVDNIA 

b GSCVWRASTGRCRPWWTTSP- 

c GHVCGERQPGDVAPGGQHRP- 

ctgtggatgactgagtacctgaaccggcacctgcacacctggatccaggataacggaggc 

1981 + + + + + + 2040 

gacacctactgactcatggacttggccgtggacgtgtggacctaggtcctattgcctccg

a LWMTEYLNRHLHTWI QDNGG 

b CG * L ST * TGTCTPG SR I TEA 

c VDD*VPEPAPAHLDPG*RRL- 

tgggatgcctttgtggaactgtacggccccagcatgcggcctctgtttgatttctcctgg 

2041 + + + + + + 2100 

accctacggaaacaccttgacatgccggggtcgtacgccggagacaaactaaagaggacc 

a WDAFVELYGPSMRPL FDFSW 

b GMPLWNCTAPACGLCLISPG- 

c GCLCGTVRPQHAASV* FLLA- 

ctgtctctgaagactctgctcagtttggccctggtgggagcttgcatcaccctgggtgcc 

2101 + + + + + + 2160

gacagagacttctgagacgagtcaaaccgggaccaccctcgaacgtagtgggacccacgg 

a LSLKTLLS LALVGAC I T LGA 

b C L*RLCSVWPWWELASPWVP- 

c VSEDSAQFGPGGSLHHPGCL- 

tatctgagccacaagtgaagtcaacatgcctgccccaaacaaatatgcaaaaggttcact 

2161 + + + + + + 2220

atagactcggtgttcacttcagttgtacggacggggtttgtttatacgttttccaagtga 

a YLSHK*SQHACPKQI CKRFT 

b I *ATSEVNMPAPNKYAKGSL- 

C SEPQVKSTCLPQTNMQKVH *- 



Figure 15-8 

aaagcagtagaaataatatgcattgtcagtgatgtaccatgaaacaaagctgcaggctgt 
2221 + + + + + + 2280 

tttcgtcatctttattatacgtaacagtcactacatggtactttgtttcgacgtccgaca 

a KAVEIICIVSDVP*NKAAGC 

b K Q * K* YALSVMYHETKLQAV 

c SSRNNMHCQ * CTMKQS CRL F - 

ttaagaaaaaataacacacatataaacatcacacacacagacagacacacacacacacaa 
2281 + + + + + + 2340

aattcttttttattgtgtgtatatttgtagtgtgtgtgtctgtctgtgtgtgtgtgtgtt 

a LRKNNTH IN I THTDRHTHTQ 

b *EKITHI*TSHTQTDTHTH N- 

c KKK*HTYKHHTHRQTHTHTT- 

caattaacagtcttcaggcaaaacgtcgaatcagctatttactgccaaagggaaatatca 

2341 + + + + + + 2400 

gttaattgtcagaagtccgttttgcagcttagtcgataaatgacggtttccctttatagt 

a QLTVFRQNVESAIYCQREIS 

b N*QSSGKTSNQLFTAKGKYH - 

c I NSLQAKRR I SYLLPKGNI I - 

tttattttttacattattaagaaaaaagatttatttatttaagacagtcccatcaaaact 
2401 + + + + + + 2460 

aaataaaaaatgtaataattcttttttctaaataaataaattctgtcagggtagttttga 

a FIFYIIKKKDLFI*DSPIKT

b LFFTLLRKKI YLFKTVP S KL - 

c YFLHY * E KR F I YLRQS HQN S - 

ccgtctttggaaatccgaccactaattgccaaacaccgcttcgtgtggctccacctggat

2461 + + + + + + 2520 

ggcagaaacctttaggctggtgattaacggtttgtggcgaagcacaccgaggtggaccta 

a PSLEIRPLIAKHRFVWLHLD 

b RLWKSDH * LPNTASCGSTWM 

C VFGNPTTNCQTPLRVAP PGC- 

gttctgtgcctgtaaacatagattcgctttccatgttgttggccggatcaccatctgaag 
2521 + + + + + + 2580 

caagacacggacatttgtatctaagcgaaaggtacaacaaccggcctagtggtagacttc 
Figure 15-9



a VLCL*T*IRFPCCWPDHHLK

b FCACKHRFAFHVVGR I T I * R - 

c SVPVNIDSLSMLLAGSPSEE- 

agcagacggatggaaaaaggacctgatcattggggaagctggctttctggctgctggagg 

2581 + + + + + + 2640 

tcgtctgcctacctttttcctggactagtaaccccttcgaccgaaagaccgacgacctcc 

a SRRMEKGPDHWGSWLSGCWR 

b ADGWKKDLI IGEAGFLAAGG- 

C QTDGKRT * SLGKLAFWLLEA- 

ctggggagaaggtgttcattcacttgcatttctttgccctgggggcgtgatattaacaga

2641 + + + + + + 2700 

gacccctcttccacaagtaagtgaacgtaaagaaacgggacccccgcactataattgtct 

a LGRRCSFTCISLPWGRDINR 

b WGEG VHSLA FLCPGGVI LTE- 

c GEKVFIHLHFFALGA*Y*QR-

gggagggttcccgtggggggaagtccatgcctccctggcctgaagaagagactctttgca 

2701 + + + + + + 2760

ccctcccaagggcaccccccttcaggtacggagggaccggacttcttctctgagaaacgt 

a GRVPVGGS PCLPGLK KRL FA 

b GGFPWGEVHASLA*RRDSLH 

C EGSRGGKSMPPWPEEETLCI- 

tatgactcacatgatgcatacctggtgggaggaaaagagttgggaacttcagatggacct

2761 + + + + + + 2820 

atactgagtgtactacgtatggaccaccctccttttctcaacccttgaagtctacctgga 

a YDSHDAYLVGGKELGTSDGP

b MTHMMHTWWEEKSWELQMDL

c * L T * CI PGGRKRVGNFRWT*- 

agtacccactgagatttccacgccgaaggacagcgatgggaaaaatgcccttaaatcata 

2821 + + + + + + 2880 

tcatgggtgactctaaaggtgcggcttcctgtcgctaccctttttacgggaatttagtat

a STH*DFHAEGQRWEKCP* I I 

b VPTEISTPKDSDGKNALKS* 

C YPLRFPRRRTAMGKMPLNHR- 
Figure 15-10

ggaaagtatttttttaagctaccaattgtgccgagaaaagcattttagcaatttatacaa 



2881 + + + + + + 2940 

cctttcataaaaaaattcgatggttaacacggctcttttcgtaaaatcgttaaatatgtt 

a GKYFFKLPIVPRKAF*QFIQ 

b ESIFLSYQLCREKHFS NLYN- 

c KVFF*ATNCAEKSILAIYTI-

tatcatccagtaccttaaaccctgattgtgtatattcatatattttggatacgcaccccc 

2941 + + + + + + 3000 

atagtaggtcatggaatttgggactaacacatataagtatataaaacctatgcgtggggg 

a YHPVP*TLIVYIHIFWIRTP 

b I IQYLKP*LCIFIYFGYAPP- 

c SSSTLNPDCVYSYI LDTHPP- 

caactcccaatactggctctgtctgagtaagaaacagaatcctctggaacttgaggaagt 

3001 + + + + + + 3060 

gttgagggttatgaccgagacagactcattctttgtcttaggagaccttgaactccttca 

a QLPILALSE* ETESSGT*GS 

b NSQYWLCLS KKQNPLELEEV- 

c TPNTGSV*VRNRILWNLRK*-

gaacatttcggtgacttccgatcaggaaggctagagttacccagagcatcaggccgccac 

3061 + + + + + + 3120 

cttgtaaagccactgaaggctagtccttccgatctcaatgggtctcgtagtccggcggtg 

a EHFGDFRSGRLELPRASGRH

b N I SVTSDQEG* SYPEHQAAT 

C TFR* LPIRKARVTQS I RPPQ- 

aagtgcctgcttttaggagaccgaagtccgcagaacctacctgtgtcccagcttggaggc 

3121 + + + + + + 3180 

ttcacggacgaaaatcctctggcttcaggcgtcttggatggacacagggtcgaacctccg 

a KCLLLGDRS PQNLPVSQLGG 

b SACF*ETEVRRTYLCPSLEA-

c VPAFRRPKSAEPTCVPAWRP- 

ctggtcctggaactgagccgggccctcactggcctcctccagggatgatcaacagggtag 

3181 + + + + + + 3240 

gaccaggaccttgactcggcccgggagtgaccggaggaggtccctactagttgtcccatc 
Figure 15-11

a LVLELSRALTGLLQG* STG * 

b WSWN*AGPSLASSRDDQQGS- 

c GPGTEPGPHWPPPGMINRVV- 

tgtggtctccgaatgtctggaagctgatggatggagctcagaattccactgtcaagaaag 

3241 + + + + + + 3300 

acaccagaggcttacagaccttcgactacctacctcgagtcttaaggtgacagttctttc 

a CGLRMSGS*WMELRIPLSRK

b VVSECLEADGWS SEFHCQER 

c WSPNVWKLMDGAQNSTVKKE- 

agcagtagaggggtgtggctgggcctgtcaccctggggccctccaggtaggcccgttttc 

3301 + + + + + + 3360 

tcgtcatctccccacaccgacccggacagtgggaccccgggaggtccatccgggcaaaag 

a SSRGVWLGLSPWGPPGRPVF 

b AVEGCGWACHPGALQVGPFS- 

C Q*RGVAGPVTLGPSR*ARFH- 

acgtggagcataggagccacgacccttcttaagacatgtatcactgtagagggaaggaac 

3361 + + + + + + 3420 

tgcacctcgtatcctcggtgctgggaagaattctgtacatagtgacatctcccttccttg 

a TWSIGATTLLKTCITVEGRN

b RGA*EPRPFLRHVSL*REGT- 

c VEHRSHDPS * DMYHCRGKEQ- 

agaggccctgggccttcctatcagaaggacatggtgaaggctgggaacgtgaggagaggc

3421 + + + + + + 3480 

tctccgggacccggaaggatagtcttcctgtaccacttccgacccttgcactcctctccg 

a RG PGPSYQKDMVKAGNV R RG 

b EALGLP I RRTW * RLGT * G E A 

c RPWAFLS EGHG EG WERE ERQ- 

aatggccacggcccattttggctgtagcacatggcacgttggctgtgtggccttggccac 

3481 + + + + + + 3540 

ttaccggtgccgggtaaaaccgacatcgtgtaccgtgcaaccgacacaccggaaccggtg 

a NGHGPFWL*HMARWLCGLGH 

b MATAHFGCSTWHVGCVALAT

C WPRPILAVAHGTLAVWPWPP- 
Figure 15-12

ctgtgagtttaaagcaaggctttaaatgactttggagagggtcacaaatcctaaaagaag 

3541 + + + + + + 3600 

gacactcaaatttcgttccgaaatttactgaaacctctcccagtgtttaggattttcttc 

a L*V*SKALNDFGEGHKS*KK 

b CEFKARL*MTLERVTNPKRS- 

C VSLKQGFK* LWRGSQI L K E A - 

cattgaagtgaggtgtcatggattaattgacccctgtctatggaattacatgtaaaacat 

3601 + + + + + + 3660 

gtaacttcactccacagtacctaattaactggggacagataccttaatgtacattttgta 

a H*SEVSWIN*PLSMELHVKH 
b IEVRCHGLIDPCLWNYM*NI

c LK*GVMD* LTPVYGI TCKTL- 

tatcttgtcactgtagtttggttttatttgaaaacctgacaaaaaaaaagttccaggtgt 

3661 + + + + + + 3720 

atagaacagtgacatcaaaccaaaataaactttcggactgtttttttttcaaggtccaca 

a YLVTVVWFYLKT*QKKS SRC 

b ILSL*FGFI*KPDKKKVPGV- 

c SCHCSLVLFENLTKKKFQVW- 

ggaatatgggggttatctgtacatcctggggcattaaaaaaaaatcaatggtggggaact

3721 + + + + + + 3780

ccttatacccccaatagacatgtaggaccccgtaatttttttttagttaccaccccttga 

a GIWGLSVH PGALKKNQ WWGT 

b EYGGYLYI LGH * KK I NG G E L - 

c NMGVI CTSWGI KK KS MVGNY- 

ataaagaagtaacaaaagaagtgacatcttcagcaaataaactaggaaatttttttttct 

3781 + + + + + + 3840 

tatttcttcattgttttcttcactgtagaagtcgtttatttgatcctttaaaaaaaaaga 

a IKK*QKK*HLQQIN*EIFFS

b *RSNKRSDIFSK*TRKFFFL-

c KEVTKEVTS SANKLGN F F F F - 

tccagtttagaatcagccttgaaacattgatggaataactctgtggcattattgcattat 

3841 + + + + + + 3900

aggtcaaatcttagtcggaactttgtaactaccttattgagacaccgtaataacgtaata 
Figure 15-13



a SSLESALKH*WNNSVALLHY 

b PV*NQP*NIDGITLWHYCII - 

c QFRISLETLME*LCGIIALY-

ataccatttatctgtattaactttggaatgtactctgttcaatgtttaatgctgtggttg 

3901 + + + + + + 3960 

tatggtaaatagacataattgaaaccttacatgagacaagttacaaattacgacaccaac 

a IPFICINFGMYSVQCLMLWL 

b YHLSVLTLECTLFNV*CCG*

c TIYLY* LWNVLCSMFNAVVD- 

atatttcgaaagctgctttaaaaaaatacatgcatctcagcgtttttttgtttttaattg 

3961 + + + + + + 4020

tataaagctttcgacgaaatttttttatgtacgtagagtcgcaaaaaaacaaaaattaac 

a IFRKLL* KNTCI SAFFCF * L 

b YFESCFKKIHASQRFFVFNC- 

c ISKAALKKYMHLSVFLFLIV- 

tatttagttatggcctatacactatttgtgagcaaaggtgatcgttttctgtttgagatt 

4021 + + + + + + 4080 

ataaatcaataccggatatgtgataaacactcgtttccactagcaaaagacaaactctaa 

a YLVMAYTLFVSKGDRFLFEI 

b I*LWPIHYL*AKVIVFCLRF- 

c FSYGLYTICEQR* SFSV * DF- 

tttatctcttgattcttcaaaagcattctgagaaggtgagataagccctgagtctcagct 

4081 + + + + + + 4140 

aaatagagaactaagaagttttcgtaagactcttccactctattcgggactcagagtcga 

a FIS*FFKSILRR*DKP*VSA

b LSLDSSKAF*EGEISPESQL- 

C YLLI LQKHSEKVR * ALS L SY- 

acctaagaaaaacctggatgtcactggccactgaggagctttgtttcaaccaagtcatgt 

4141 + + + + + + 4200 

tggattctttttggacctacagtgaccggtgactcctcgaaacaaagttggttcagtaca

a T*EKPGCHWPLRSFVSTKSC 

b PKKNLDVTGH*GALFQPSHV

c LRKTWMS LATEELC FNQVMC- 
Figure 15-14

gcatttccacgtcaacagaattgtttattgtgacagttatatctgttgtccctttgacct 

4201 + + + + + + 4260

cgtaaaggtgcagttgtcttaacaaataacactgtcaatatagacaacagggaaactgga 

a AFPRQQNCLL*QLYLLSL*P

b HFHVNRIVYCDSYICCPFDL

c ISTSTELFIVTVISVVPLTL- 

tgtttcttgaaggtttcctcgtccctgggcaattccgcatttaattcatggtattcagga 

4261 + + + + + + 4320 

acaaagaacttccaaaggagcagggacccgttaaggcgtaaattaagtaccataagtcct 

a CFLKVSSSLGNSAFNSWYSG 

b VS*RFPRPWAIPHLIHGIQD- 

C FLEGFLVPGQFRI * FMVFR I - 

ttacatgcatgtttggttaaacccatgagattcattcagttaaaaatccagatggcgaat 

4321 + + + + + + 4380 

aatgtacgtacaaaccaatttgggtactctaagtaagtcaatttttaggtctaccgctta 

a LHACLVKPMRFIQLKIQMAN

b YMHVWLNP*DSFS*KSRWRM

c TCMFG * THEIHSVKNPDGE * - 

gaccagcagattcaaatctatggtggtttgacctttagagagttgctttacgtggcctgt 

4381 + + + + + + 4440 

ctggtcgtctaagtttagataccaccaaactggaaatctctcaacgaaatgcaccggaca 

a DQQIQIYGGLTFRELLYVAC

b TSRFKSMVV* PLESCFTWP V- 

c PADSNLWWFDL* RVALRGLF- 

ttcaacacagacccacccagagccctcctgccctccttccgcgggggctttctcatggct 

4441 + + + + + + 4500 

aagttgtgtctgggtgggtctcgggaggacgggaggaaggcgcccccgaaagagtaccga 

a FNTDPPRALLPSFRGGFLMA 

b STQTHPEPSCPPSAGAFSWL- 

c QHRPTQS PPALLPRGLSHGC- 

gtccttcagggtcttcctgaaatgcagtggtcgttacgctccaccaagaaagcaggaaac

4501 + + + + + + 4560 

caggaagtcccagaaggactttacgtcaccagcaatgcgaggtggttctttcgtcctttg 
Figure 15-15

a VLQGLPEMQWSLRSTKKAGN 

b SFRVFLKCSGRYAPPRKQET- 

c PSGSS*NAVVVTLHQESRKP-

ctgtggtatgaagccagacctccccggcgggcctcagggaacagaatgatcagacctttg 

4561 + + + + + + 4620 

gacaccatacttcggtctggaggggccgcccggagtcccttgtcttactagtctggaaac 

a LWYEARPPRRASGNRMIRPL 

b CGMKPDLPGGPQGTE* SDL* 

c V V * SQTSPAGLREQNDQTFE- 

aatgattctaatttttaagcaaaatattattttatgaaaggtttacattgtcaaagtgat 

4621 + + + + + + 4680 

ttactaagattaaaaattcgttttataataaaatactttccaaatgtaacagtttcacta 

a NDSNF*AKYYFMKGLHCQSD 

b MILIFKQNI IL*KVYIVKVM 

C * F* FLSKILFYERFTLS K* * - 

gaatatggaatatccaatcctgtgctgctatcctgccaaaatcattttaatggagtcagt 

4681 + + + + + + 4740 

cttataccttataggttaggacacgacgataggacggttttagtaaaattacctcagtca 

a EYGISNPVLLSCQNHFNGVS 

b NMEYPILCCYPAKI ILMESV- 

C IWNIQSCAAILPKSF*WSQF- 

ttgcagtatgctccacgtggtaagatcctccaagctgctttagaagtaacaatgaagaac 

4741 + + + + + + 4800

aacgtcatacgaggtgcaccattctaggaggttcgacgaaatcttcattgttacttcttg 

a LQYAPRGKILQAALEVTMKN 

b CSMLHVVRSSKLL* K* Q* RT 

c AVCSTW*DPPSCFRSNNEER- 

gtggacgtttttaatataaagcctgttttgtcttttgttgttgttcaaacgggattcaca 

4801 + + + + + + 4860 

cacctgcaaaaattatatttcggacaaaacagaaaacaacaacaagtttgccctaagtgt 

a VDVFNI KPVLS FVVVQTGFT 

b WTFLI *SLFCLLLLFKRDSQ 

C GRF*YKACFVFCCCSNG I HR- 
Figure 15-16

gagtatttgaaaaatgtatatatattaagaggtcacgggggctaattgctagctggctgc 

4861 + + + + + + 4920 

ctcataaactttttacatatatataattctccagtgcccccgattaacgatcgaccgacg 

a EYLKNVYI LRGHGG* LLAGC 

b SI * K M Y I Y * EVTGANC* LAA 

C VFEKCIYI KRSRGLIASWLP- 

cttttgctgtggggttttgttacctggttttaataacagtaaatgtgcccagcctcttgg 
4921 + + + + + + 4980

gaaaacgacaccccaaaacaatggaccaaaattattgtcatttacacgggtcggagaacc 

a LLLWGFVTWF**Q*MCPASW 

b FCCGVLLPGFNNSKCAQPLG- 

c FAVGFCYLVL I TVNVP S L LA- 

ccccagaactgtacagtattgtggctgcacttgctctaagagtagttgatgttgcatttt 
4981 + + + + + + 5040

ggggtcttgacatgtcataacaccgacgtgaacgagattctcatcaactacaacgtaaaa 

a PQNCTVLWLHLL*E* LMLHF 

b PRTVQYCGCTCSKSS*CCIF- 

c PELYS I VAALALRVVDVA F S - 

ccttattgttaaaaacatgttagaagcaatgaatgtatataaaagc 
5041 + + + + 5086

ggaataacaatttttgtacaatcttcgttacttacatatattttcg 

a PYC*KHVRSNECI*K
b LIVKNMLEAMNVYKS
c LLL KT C * KQ * MY I K 

Enzymes that do cut:

NONE 

Enzymes that do not cut: 
Not I 
Figure 16-1 Semaphorin III

(Linear) MAP of: hshsem check: 7721 from: 1 to: 2530 

RL ; HSHSEM - Homo sapiens semaphorin-III (Hsema-I) mRNA, complete cds.

ID HSHSEM standard; RNA; HUM; 2530 BP.

AC L26081;
NI g799328

DT 09-MAR-1994 (Rel. 38, Created)

DT 12-MAY-1995 (Rel. 43, Last updated, Version 3) . . . 
With 1 enzymes: NOTI 



ggaattccctgcagcatgggctggttaactaggattgtctgtcttttctggggagtatta 
1 + + + + + + 60 

ccttaagggacgtcgtacccgaccaattgatcctaacagacagaaaagacccctcataat 

a GIPCSMGWLTRIVCLFWGVL 

b EFPAAWAG* LGLSVFSGEYY 

c NSLQHGLVN* DCLS F L G S I T - 

cttacagcaagagcaaactatcagaatgggaagaacaatgtgccaaggctgaaattatcc 
61 + + + + + + 120

gaatgtcgttctcgtttgatagtcttacccttcttgttacacggttccgactttaatagg 

a LTARANYQNGKNNVP R L K L S 

b LQQEQT I RMGRTMCQG * NY P 

c YSKSKLSEWEEQCAKAEIIL-

tacaaagaaatgttggaatccaacaatgtgatcactttcaatggcttggccaacagctcc 
121 + + + + + + 180 

atgtttctttacaaccttaggttgttacactagtgaaagttaccgaaccggttgtcgagg 

a YKEMLE SNNVI TFNG LA NS S 

b TKKCWNPTM*SLSMAWPTAP- 

C QRNVGI QQCDHFQWLGQQLQ- 

agttatcataccttccttttggatgaggaacggagtaggctgtatgttggagcaaaggat 
181 + + + + + + 240 

tcaatagtatggaaggaaaacctactccttgcctcatccgacatacaacctcgtttccta 

a SYHTFLLDEERSRLYVGAKD 

b VIIPSFWMRNGVGCMLEQRI 

c LSYLPFG*GTE*AVCWS KGS- 
Figure 16-2

cacatattttcattcgacctggttaatatcaaggattttcaaaagattgtgtggccagta 
241 + + + + + + 300 

gtgtataaaagtaagctggaccaattatagttcctaaaagttttctaacacaccggtcat 

a HIFSFDLVNI KDFQKIVWPV 

b TYFHSTWLI SRI FKRLCGQY 

c HIFIRPG*YQGFSKDCVAS I - 

tcttacaccagaagagatgaatgcaagtgggctggaaaagacatcctgaaagaatgtgct 
301 + + + + + + 360 

agaatgtggtcttctctacttacgttcacccgaccttttctgtaggactttcttacacga 

a SYTRRDECKWAGKDI LKECA 

b LTPEEMNASGLEKTS * KNVL 

c LHQKR*MQVGWKRHPERMC*- 

aatttcatcaaggtacttaaggcatataatcagactcacttgtacgcctgtggaacgggg 

361 + + + + + + 420

ttaaagtagttccatgaattccgtatattagtctgagtgaacatgcggacaccttgcccc 

a NFIKVLKAYNQTHLYACGTG

b ISSRYLRHIIRLTCTPVERG

c FHQGT*GI*SDSLVRLWNGG-

gcttttcatccaatttgcacctacattgaaattggacatcatcctgaggacaatattttt 
421 + + + + + + 480 

cgaaaagtaggttaaacgtggatgtaactttaacctgtagtaggactcctgttataaaaa 

a AFHPICTYIEIGHHPEDNIF 

b LFIQFAPTLKLDIILRTIFL- 

C FSSNLHLH*NWTSS * GQYF*- 

aagctggagaactcacattttgaaaacggccgtgggaagagtccatatgaccctaagctg 
481 + + + + + + 540 

ttcgacctcttgagtgtaaaacttttgccggcacccttctcaggtatactgggattcgac 

a KLENSHFENGRGKS PYDPKL 

b SWRTHILKTAVGRVHMTLSC 
c AGELTF*KRPWEESI *P*AA- 

ctgacagcatcccttttaatagatggagaattatactctggaactgcagctgattttatg 
541 + + + + + + 600 

gactgtcgtagggaaaattatctacctcttaatatgagaccttgacgtcgactaaaatac 
Figure 16-3

a LTASLLIDGELYSGTAADFM

b *QHPF**MENYTLELQLILW- 

C DSIPFNRWRIILWNCS*FYG- 

gggcgagactttgctatcttccgaactcttgggcaccaccacccaatcaggacagagcag
601 + + + + + + 660 

cccgctctgaaacgatagaaggcttgagaacccgtggtggtgggttagtcctgtctcgtc 

a GRDFAI FRTLGHHHPIRTEQ 

b GETLLSSELLGTTTQSGQS S 

c ARLCYLPNSWAPPPNQDRAA- 

catgattccaggtggctcaatgatccaaagttcattagtgcccacctcatctcagagagt 
661 + + + + + + 720 

gtactaaggtccaccgagttactaggtttcaagtaatcacgggtggagtagagtctctca 

a HDSRWLNDPKFISAHLI SES 

b MIPGGSMIQSSLVPTSSQRV- 

c * FQVAQ* S K V H * CPPHLRE * - 

gacaatcctgaagatgacaaagtatactttttcttccgtgaaaatgcaatagatggagaa 
721 + + + + + + 780 

ctgttaggacttctactgtttcatatgaaaaagaaggcacttttacgttatctacctctt 

a DNPEDDKVYFFFRENAIDGE 

b TILKMTKYTFSSVKMQ*MEN- 

C QS * R * Q S ILFLP* KCNRWRT- 

cactctggaaaagctactcacgctagaataggtcagatatgcaagaatgactttggaggg 
781 + + + + + + 840 

gtgagaccttttcgatgagtgcgatcttatccagtctatacgttcttactgaaacctccc 

a HSGKATHARIGQICKNDFGG

b TLEKLLTLE * VRYARMT LE G 

C LWKSYSR*NRSDMQE* LWRA- 

cacagaagtctggtgaataaatggacaacattcctcaaagctcgtctgatttgctcagtg 
841 + + + + + + 900 

gtgtcttcagaccacttatttacctgttgtaaggagtttcgagcagactaaacgagtcac 

a HRSLVNKWTTFLKARLICSV

b TEVW* INGQHSSKLV* FAQC 

c QKSGE*MDNIPQSSSDLLSA- 
Figure 16-4

ccaggtccaaatggcattgacactcattttgatgaactgcaggatgtattcctaatgaac 

901 + + + + + + 960 

ggtccaggtttaccgtaactgtgagtaaaactacttgacgtcctacataaggattacttg 

a PGPNGIDTHFDELQDVFLMN

b QVQMALTLILMNCRMYS**T- 

c RSKWH*HSF * *TAGC I PNEL- 

tttaaagatcctaaaaatccagttgtatatggagtgtttacgacttccagtaacattttc 

961 + + + + + + 1020 

aaatttctaggatttttaggtcaacatatacctcacaaatgctgaaggtcattgtaaaag

a FKDPKNPVVYGVFTTS SN I F 

b LKILKIQLYMECLRLPVTFS- 

c *RS*KSSCIWSVYDFQ*HFQ- 

aagggatcagccgtgtgtatgtatagcatgagtgatgtgagaagggtgttccttggtcca 

1021 + + + + + + 1080 

ttccctagtcggcacacatacatatcgtactcactacactcttcccacaaggaaccaggt 

a KGSAVCMYSMSDVRRV FLG P 

b RDQPCVCIA*VM*EGCSLVH- 

c GISRVYV*HE*CEKGVPWSI-

tatgcccacagggatggacccaactatcaatgggtgccttatcaaggaagagtcccctat 

1081 + + + + + + 1140 

atacgggtgtccctacctgggttgatagttacccacggaatagttccttctcaggggata 

a YAHRDGPNYQWVPYQGRVPY

b MPTGMDPTINGCLIKEESPI

c CPQGWTQLSMGALSRKS PLS- 

ccacggccaggaacttgtcccagcaaaacatttggtggttttgactctacaaaggacctt 

1141 + + + + + + 1200

ggtgccggtccttgaacagggtcgttttgtaaaccaccaaaactgagatgtttcctggaa 

a PRPGTCPSKTFGGFDSTKDL 

b HGQELVPAKHLVVLTLQRTF- 

c TARNLSQQNIWWF*LYKGPS-

cctgatgatgttataacctttgcaagaagtcatccagccatgtacaatccagtgtttcct

1201 + + + + + + 1260 

ggactactacaatattggaaacgttcttcagtaggtcggtacatgttaggtcacaaagga 
Figure 16-5



a PDDVI TFARSHPAMYNPVFP 

b LMML* PLQEVIQPCTIQCFL 

c **CYNLCKKSSSHVQSSVSY 



atgaacaatcgcccaatagtgatcaaaacggatgtaaattatcaatttacacaaattgtc 
1261 + + + + + + 1320

tacttgttagcgggttatcactagttttgcctacatttaatagttaaatgtgtttaacag



a MNNRPIVIKTDVNYQFTQIV 

b *TIAQ**SKRM*IINLHKLS-

c EQSPNSDQNGCKLS I YTNCR- 



gtagaccgagtggatgcagaagatggacagtatgatgttatgtttatcggaacagatgtt
1321 + + + + + + 1380

catctggctcacctacgtcttctacctgtcatactacaatacaaatagccttgtctacaa



a VDRVDAEDGQYDVMF I GTDV 

b *TEWMQKMDSMMLCLSEQML-

c RPSGCRRW TV*CYVYRNRCW- 



gggaccgttcttaaagtagtttcaattcctaaggagacttggtatgatttagaagaggtt 
1381 + + + + + + 1440

ccctggcaagaatttcatcaaagttaaggattcctctgaaccatactaaatcttctccaa



a GTVLKVVS I PKETWYDLEEV 

b G ? F L K * FQFLRRLGM I * KRF- 

c DRS * SS FNS * G D L V * FRRGS- 



ctgctggaagaaatgacagtttttcgggaaccgactgctatttcagcaatggagctttcc
1441 + + + + + + 1500

gacgaccttctttactgtcaaaaagcccttggctgacgataaagtcgttacctcgaaagg 



a LLEEMTVFREPTAISAMELS

b CWKK*QFFGNRLLFQQWSFP

c AGRNDS FSGTDCYF SNGAFH- 



actaagcagcaacaactatatattggttcaacggctggggttgcccagctccctttacac 
1501 + + + + + + 1560

tgattcgtcgttgttgatatataaccaagttgccgaccccaacgggtcgagggaaatgtg



a TKQQQLYIGSTAGVAQLPLH

b LSSNNYILVQRLGLPSSLYT-

c *AATTIYWFNGWGCPAPFTP- 
Figure 16-6

cggtgtgatatttacgggaaagcgtgtgctgagtgttgcctcgcccgagacccttactgt 

1561 + + + + + + 1620 

gccacactataaatgccctttcgcacacgactcacaacggagcgggctctgggaatgaca 

a RCDI YGKACAECCLARDPYC 

b GVI FTGKRVLSVAS PETLTV 

c V*YLRESVC*VLPRPRPLLC- 

gcttgggatggttctgcatgttctcgctattttcccactgcaaagagacgcacaagacga 

1621 + + + + + + 1680 

cgaaccctaccaagacgtacaagagcgataaaagggtgacgtttctctgcgcgttctgct 

a AWDGSACSRYFPTAKRRTRR 

b LGMVLHVLAI FPLQRDAQD D 

C LGWFCMFSLFSHC KETH KTT- 

caagatataagaaatggagacccactgactcactgttcagacttacaccatgataatcac 

1681 + + + + + + 1740 

gttctatattctttacctctgggtgactgagtgacaagtctgaatgtggtactattagtg 

a QDI RNGDPLTHCSDLHHDNH 

b KI*EMETH*LTVQTYTMIIT- 

c RYKKWRPTDSLFRLTP* * S P - 

catggccacagccctgaagagagaatcatctatggtgtagagaatagtagcacatttttg 

1741 + + + + + + 1800 

gtaccggtgtcgggacttctctcttagtagataccacatctcttatcatcgtgtaaaaac 

a HGHSPEERIIYGVENSSTFL

b MATALKRESSMV*RIVAHFW

C WPQP*RENHLWCRE**HIFG- 

gaatgcagtccgaagtcgcagagagcgctggtctattggcaattccagaggcgaaatgaa 

1801 + + + + + + 1860 

cttacgtcaggcttcagcgtctctcgcgaccagataaccgttaaggtctccgctttactt 

a ECSPKSQRALVYWQFQRRNE 

b NAVRSRRERWS IGNSRGEMK- 

c MQSEVAESAGLLAI PEAK* R - 

gagcgaaaagaagagatcagagtggatgatcatatcatcaggacagatcaaggccttctg 

1861 + + + + + + 1920 

ctcgcttttcttctctagtctcacctactagtatagtagtcctgtctagttccggaagac 
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Figure 16-7 

a ERKEEIRVDDHI IRTDQGLL 

b SEKKRSEWMI I SSGQIKAFC- 

c AKRRDQSG* SYHQD RSRPSA- 

ctacgtagtctacaacagaaggattcaggcaattacctctgccatgcggtggaacatggg 

1921 + + + + + + 1980 

gatgcatcagatgttgtcttcctaagtccgttaatggagacggtacgccaccttgtaccc 

a LRSLQQKDSGNYLCHAVEHG 

b YVVYNRRIQAITSAMRWNMG- 

c T*STTEGFRQLPLPCGGTWV-

ttcatacaaactcttcttaaggtaaccctggaagtcattgacacagagcatttggaagaa 

1981 + + + + + 2040 

aagtatgtttgagaagaattccattgggaccttcagtaactgtgtctcgtaaaccttctt 

a FIQTLLKVTLEVIDTEHLEE 

b SYKLFLR* PWKSLTQSIWKN- 

c HTNSS*GNPGSH* HRAFGRT- 

cttcttcataaagatgatgatggagatggctctaagaccaaagaaatgtccaatagcatg 

2041 + + + + + + 2100 

gaagaagtatttctactactacctctaccgagattctggtttctttacaggttatcgtac 

a LLHKDDDGDGS KTKEMSNSM 

b FFIKMMME MALRPKKCPIA* 

C SS*R**WRWL*DQRNVQ*HD- 

acacctagccagaaggtctggtacagagacttcatgcagctcatcaaccaccccaatctc 

2101 + + + + + + 2160 

tgtggatcggtcttccagaccatgtctctgaagtacgtcgagtagttggtggggttagag 

a TPSQKVWYRDFMQLINHPNL

b H LARR SGT ET S C S S S TT P I S 

C T*PEGLVQRLHAAHQPPQSQ- 

aacacgatggatgagttctgtgaacaagtttggaaaagggaccgaaaacaacgtcggcaa 

2161 + + + + + + 2220 

ttgtgctacccactcaagacacttgttcaaaccttttccctggcttttgttgcagccgtt 

a NTMDEFCEQVWKRDRKQRRQ 

b TRWMSSVNKFGKGTENNVGK- 

C H D G * V L * TSLEKGPKTTSAK- 
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Figure 16-8 

aggccaggacataccccagggaacagtaacaaatggaagcacttacaagaaaataagaaa 

2221 + + + + + + 2280 

tccggtcctgtatggggtcccttgtcattgtttaccttcgtgaatgttcttttattcttt 

a RPGHTPGNSNKWKHLQENKK 

b GQDIPQGTVTNGSTYKKIRK

c ARTYPREQ* QMEALTRK* E R - 

ggtagaaacaggaggacccacgaatttgagagggcacccaggagtgtctgagctgcatta 

2281 + + + + + + 2340 

ccatctttgtcctcctgggtgcttaaactctcccgtgggtcctcacagactcgacgtaat 

a GRNRRTHEFERAPRSV * A A L 

b VETGGPTNLRGHPGVSELH Y 

c * KQEDPRI * EGTQECLSC I T - 

cctctagaaacctcaaacaagtagaaacttgcctagacaataactggaaaaacaaatgca 

2341 + + + + + + 2400 

ggagatctttggagtttgttcatctttgaacggatctgttattgacctttttgtttacgt

a PLETSNK* K L A * T I TGKTNA 

b L* KPQTSRNLPRQ* LEKQMQ 

c S RNLKQVET CLDNNW KN KCN- 

atatacatgaacttttttcatggcattatgtggatgtttacaatggtgggaaattcagct 

2401 + + + + + + 2460 

tatatgtacttgaaaaaagtaccgtaatacacctacaaatgttaccaccctttaagtcga 

a IYMNFFHGIMWMFTMVGNSA

b YT* TFFMALCGCLQWWE I QL 

c IHELFSWHYVDVYNGGKFS * - 

gagttccaccaattataaattaaatccatgagtaactttcctaataggcttttttttcct

2461 + + + + + + 2520 

ctcaaggtggttaatatttaatttaggtactcattgaaaggattatccgaaaaaaaagga 

a EFHQL* IKSMSNFPNRLFFP 

b SSTNYKLNP* VTFLI GFFFL- 

c VPPIIN*IHE*LS**AFFS*- 

aataccaccg 

2521 + 2530 

ttatggtggc 
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Figure 16-9 

a N T T 

b I P P 

c Y H 

Enzymes that do cut : 

NONE 

Enzymes that do not cut : 
Not I 
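
The enzyme listing above can be reproduced with a plain string search. The following is a minimal sketch, not the program that generated the map: it assumes the standard Not I recognition sequence GCGGCCGC, and the names `NOTI_SITE` and `find_sites` are hypothetical. Because the site is palindromic, scanning one strand is sufficient.

```python
# Assumed recognition sequence for Not I (palindromic, so one strand is enough).
NOTI_SITE = "GCGGCCGC"


def find_sites(sequence: str, site: str = NOTI_SITE) -> list[int]:
    """Return the 1-based start positions of every occurrence of `site`."""
    # Drop whitespace left over from the map layout and normalise case.
    seq = "".join(sequence.split()).upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        # Step one base forward so overlapping sites are also reported.
        start = seq.find(site, start + 1)
    return positions


# Final ten bases of the map above (positions 2521-2530).
fragment = "aataccaccg"
print(find_sites(fragment))  # prints [] because the fragment has no Not I site
```

An empty result for the full sequence corresponds to the enzyme being listed under "Enzymes that do not cut".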
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Figure 17-1 HUPF-1 

(Linear) MAP of: hsu59323.em_hum2 check: 6138 from: 1 to: 3602

RL;HSU59323 - Human homolog of yeast UPF1 (HUPF-1) mRNA, complete cds.

ID HSU59323 standard; RNA; HUM; 3602 BP.

XX AC U59323;

XX NI g1633577 . . .

gcggcggctcggcactgttacctctcggtccggctggcgccgcgggcggtttggtccttt 
1 + + + + + + 60

cgccgccgagccgtgacaatggagagccaggccgaccgcggcgcccgccaaaccaggaaa 

a AAARHCYLSVRLAPRAVWSF 

b RRLGTVTSRSGWRRGRFGPF- 

c GGSALLPLGPAGAAGGLVL S- 

ccgggcgcgcgggggcgacagcggcagcgacccgaggcctgcggcctaggcctcagcgcg 

61 + + + + + 120 

ggcccgcgcgcccccgctgtcgccgtcgctgggctccggacgccggatccggagtcgcgc 

a PGARGRQRQRPEACGLGLSA 

b RARGGDSGSDPRPAA*ASAR- 

c GRAGATAAATRGLR PR PQRG- 

gcggcgggctcgagtgcagcgcggaaccggcccgagggccctacccggaggcaccatgag 

121 + + + + + +- 180 

cgccgcccgagctcacgtcgcgccttggccgggctcccgggatgggcctccgtggtactc 

a AAGSSAARNRPEGPTRRHHE 

b RRARVQRGTGPRALPGGTMS 

c GGLECSAEPARGPYPEAP*A- 

cgtggaggcgtacgggcccagctcgcagactctcactttcctggacacggaggaggccga 

181 + + + + + + 240 

gcacctccgcatgcccgggtcgagcgtctgagagtgaaaggacctgtgcctcctccggct 

a RGGVRAQLADSHFPGHGGGR 

b VEAYGPS SQTLTFLDTEEAE- 

c WRRTGPARRLSLSWTRRRPS- 

gctgcttggcgccgacacacagggctccgagttcgagttcaccgactttactcttcctag 

241 + + + + + + 300 

cgacgaaccgcggctgtgtgtcccgaggctcaagctcaagtggctgaaatgagaaggatc 
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Figure 17-2 

a AAWRRHTGLRVRVHRLYSS* 

b LLGADTQGSEFEFTDFTLPS 

C CLAPTHRAPSSSS PTLLFLA- 

ccagacgcagacgccccccggcggccccggcggcccgggcggtggcggcgcgggaagccc 
301 + + + + + + 360

ggtctgcgtctgcggggggccgccggggccgccgggcccgccaccgccgcgcccttcggg 

a PDADAPRRPRRPGRWRRGKP 

b QTQTPPGGPGGPGGGGAGSP

c RRRRPPAAPAARAVAAREAR-

gggcggcgcgggcgccggcgctgcggcgggacagctcgacgcgcaggttgggcccgaagg 
361 + + + + + + 420 

cccgccgcgcccgcggccgcgacgccgccctgtcgagctgcgcgtccaacccgggcttcc 

a GRRGRRRCGGTARRAGWARR

b GGAGAGAAAGQLDAQVGPEG- 

c AARAPALRRDS STR RLG P K A - 

catcctgcagaacggggctgtggacgacagtgtagccaagaccagccagttgttggctga
421 + + + + + + 480

gtaggacgtcttgccccgacacctgctgtcacatcggttctggtcggtcaacaaccgact

a HPAERGCGRQCSQDQ PVVG * 

b I LQNGAVDDSVAKTSQLLAE 

c SCRTGLWTTV*PRPASCWLS- 

gttgaacttcgaggaagatgaagaagacacctattacacgaaggacctccccatacacgc
481 + + + + + + 540

caacttgaagctccttctacttcttctgtggataatgtgcttcctggaggggtatgtgcg 

a VELRGR*RRHLLHEGPPHTR

b LNFEEDEEDTYYTKDLPIHA-

c *TSRKMKKTPITRRTSPYTP-

ctgcagttactgtggaatacacgatcctgcctgcgtggtttactgtaataccagcaagaa 

541 + + + + + + 600 

gacgtcaatgacaccttatgtgctaggacggacgcaccaaatgacattatggtcgttctt 

a LQLLWNTRSCLRGLL*YQQE 

b CSYCGIHDPACVVYCNTSKK- 

C AVTVEYTILPAWFTVI PARS- 
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Figure 17-3 

gtggttctgcaacggacgtggaaatacttctggcagccacattgtaaatcaccttgtgag 

601 + + + + + + 660 

caccaagacgttgcctgcacctttatgaagaccgtcggtgtaacatttagtggaacactc 

a VVLQRTWKYFWQPHCKS PCE 

b W FCNGRGNT S GSH I VN HLVR 

C GSATDVE I LLAATL* I TL* G - 

ggcaaaatgcaaagaggtgaccctgcacaaggacgggcccctgggggagacagtcctgga 

661 + + + + + + 720 

ccgttttacgtttctccactgggacgtgttcctgcccggggaccccctctgtcaggacct 

a GKMQRGDPAQGRAPGGDSPG 

b AKCKEVTLHKDGPLGETVLE 

c QNAKR* PCTRTGPWGRQSWS- 

gtgctacaactgcggctgtcgcaacgtcttcctcctcggcttcatcccggccaaagctga 

721 + + + + + + 780 

cacgatgttgacgccgacagcgttgcagaaggaggagccgaagtagggccggtttcgact

a VLQLRLSQRLPPRLHPGQS * 

b CYNCGCRNVFLLGFI PAKAD 

c ATTAAVATS S SSAS S R P KLT- 

ctcagtggtggtgctgctgtgcaggcagccctgtgccagccagagcagcctcaaggacat 

781 + + + + + + 840 

gagtcaccaccacgacgacacgtccgtcgggacacggtcggtctcgtcggagttcctgta 

a LSGGAAVQAALCQPEQPQGH

b SVVVLLCRQPCASQSSLKDI-

c QWWCCCAGSPVPA RAASRTS- 

caactgggacagctcgcagtggcagccgctgatccaggaccgctgcttcctgtcctggct 

841 + + + + + + 900 

gttgaccctgtcgagcgtcaccgtcggcgactaggtcctggcgacgaaggacaggaccga 

a QLGQLAVAAADPGPLLPVLA 

b NWDSSQWQPLIQDRCFLSWL-

C TGTARSGSR* SRTA ASCPGW- 

ggtcaagatcccctccgagcaggagcagctgcgggcacgccagatcacggcacagcagat 

901 + + + + + + 960 

ccagttctaggggaggctcgtcctcgtcgacgcccgtgcggtctagtgccgtgtcgtcta 
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Figure 17-4 

a GQDPLRAGAAAGTPDHGTAD

b VKIPSEQEQLRARQITAQQI

c SRSPPSRSSCGHARSRHSRS- 

caacaagctggaggagctgtggaaggaaaacccttctgccacgctggaggacctggagaa 

961 + + + + + + 1020 

gttgttcgacctcctcgacaccttccttttgggaagacggtgcgacctcctggacctctt 

a QQAGGAVEGKPFCHAGG PGE - 

b NKLEELWKENPSATLEDLEK-
c TSWRS CGRKTLLP RWRTWR S - 

gccgggggtggacgaggagccgcagcatgtcctcctgcggtacgaggacgcctaccagta

1021 + + + + + + 1080

cggcccccacctgctcctcggcgtcgtacaggaggacgccatgctcctgcggatggtcat

a AGGGRGAAACPPAVRGRLPV

b PGVDEEPQHVLLRYEDAYQY

c RGWTRSRSMSSCGTRTPTST- 

ccagaacatattcgggcccctggtcaagctggaggccgactacgacaagaagctgaagga 

1081 + + + + + + 1140 

ggtcttgtataagcccggggaccagttcgacctccggctgatgctgttcttcgacttcct 

a PEHIRAPGQAGGRLRQEAEG 

b QNIFGPLVKLEADYDKKLKE- 

C RTYSGPWSSWRPTTTRS*RS- 

gtcccagactcaagataacatcactgtcaggtgggacctgggccttaacaagaagagaat 

1141 + + + + + + 1200 

cagggtctgagttctattgtagtgacagtccaccctggacccggaattgttcttctctta 

a VPDSR*HHCQVGPGP* Q E E N 

b SQTQDNITVRWDLGLNKKRI-

c PRLKITSLSGGTWALTRRES-

cgcctacttcactttgcccaagactgactctgacatgcggctcatgcagggggatgagat 

1201 + + + + + + 1260 

gcggatgaagtgaaacgggttctgactgagactgtacgccgagtacgtccccctactcta - 

a RLLHFAQD * L * HAAHAGG * D 

b AYFTLPKTDSDMRLMQGDEI

c PTSLCPRLTLTCGSCRGMRY-
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Figure 17-5 

atgcctgcggtacaaaggggaccttgcgcccctgtggaaagggatcggccacgtcatcaa 

1261 + + + + + + 1320 

tacggacgccatgtttcccctggaacgcggggacacctttccctagccggtgcagtagtt 

a MPAVQRGPCAPVERDRPR HQ 

b CLRYKGDLAPLWKG I GHVI K - 

C ACGTKGTLRPCGKGSATSSR- 

ggtccctgataattatggcgatgagatcgccattgagctgcggagcagcgtgggtgcacc 

1321 + + + + + + 1380 

ccagggactattaataccgctactctagcggtaactcgacgcctcgtcgcacccacgtgg 

a G P * * LWR* DRH*AAEQRGCT 

b VPDNYGDEIAIELRSSVGAP

c SLIIMAMRSPLSCGAAWVHL- 

tgtggaggtgactcacaacttccaggtggattttgtgtggaagtcgacctcctttgacag 

1381 + + + + + + 1440 

acacctccactgagtgttgaaggtccacctaaaacacaccttcagctggaggaaactgtc 

a CGGDSQLPGGFCVEVDLL* Q 

b VEVTHNFQVDFVWKSTS FDR- 

C WR* LTTSRWI LCGSRP PLTG- 

gatgcagagcgcattgaaaacgtttgccgtggatgagacctcggtgtctggctacatcta

1441 + + + + + + 1500 

ctacgtctcgcgtaacttttgcaaacggcacctactctggagccacagaccgatgtagat 

a DAERIENVCRG*DLGVWLHL

b MQSALKTFAVDETSVSGYIY

c CRAH*KRLPWMRPRCLATST-

ccacaagctgttgggccacgaggtggaggacgtaatcaccaagtgccagctgcccaagcg 

1501 + + + + + - + 1560 

ggtgttcgacaacccggtgctccacctcctgcattagtggttcacggtcgacgggttcgc 

a PQAVGPRGGGRNHQVPAAQA 

b HKLLGHEVEDVITKCQL PKR 

c TSCWATRWRT*SPSASCPSA- 

cttcacggcgcagggcctccccgacctcaaccactcccaggtttatgccgtgaagactgt 

1561 + + + + + + 1620 

gaagtgccgcgtcccggaggggctggagttggtgagggtccaaatacggcacttctgaca 
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Figure 17-6 

a LHG AGPPRPQPLPGLCREDC 

b FTA QGLPDLNHS QVYAVKTV- 

c SRRRASPTS TTPRFMP * RLC- 

gctgcaaagaccactgagcctgatccagggcccgccaggcacggggaagacggtgacgtc 

1621 + + + + + .+ 1680 

cgacgtttctggtgactcggactaggtcccgggcggtccgtgccccttctgccactgcag 

a AAKTTEPDPGPARHGEDGDV 
b LQRPLSLIQGPPGTGKTVTS 
c C K D H * A * SRARQARGRR* R R - 

ggccaccatcgtctaccacctggcccggcaaggcaacgggccggtgctggtgtgtgctcc 

1681 + + + + + + 1740 

ccggtggtagcagatggcggaccgggccgttccgttgcccggccacgaccacacacgagg 

a GHHRLPPGPARQRAGAGVCS 
b AT IVYHLARQGNG PVLVCAP 

C PPSSTTWPGKATGRCWCVLR- 

gagcaacatcgccgtggaccagctaacggagaagatccaccagacggggctaaaggtcgt 

1741 + + + + + + 1800 

ctcgttgtagcggcacccggtcgattgcctcttctaggtggtctgccccgatttccagca 

a EQHRRGPANGEDPPDGAKGR 

b SNIAVDQLTEKI HQTGLKVV- 

c ATS PWTS* RRRSTRRG* RSC- 

gcgcctctgcgccaagagccgtgaggccatcgactccccggtgtcttttctggccctgca 

1801 + + + + + + 1860 

cgcggagacgcggttctcggcactccggtagctgaggggccacagaaaagaccgggacgt 

a APLRQEP*GHRLPGVFSGPA
b RLCAKSREAIDSPVSFLALH

C ASAPRAVRPSTPRCLFW PCT- 

caaccagatcaggaacatggacagcatgcctgagctgcagaagctgcagcagctgaaaga 

1861 + + + + + + 1920 

gttggtctagtccttgtacctgtcgtacggactcgacgtcttcgacgtcgtcgactttct 

a QPDQEHGQHA* AAEAAAAER 

b NQIRNMDSMPELQKLQQLKD- 

c TRSGTWTACLSCRSCSS*KT- 
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Figure 17-7 

cgagactggggagctgtcgtctgccgacgagaagcggtaccgggccttgaagcgcaccgc 

1921 + + + + + + 1980 

gctctgacccctcgacagcagacggctgctcttcgccatggcccggaacttcgcgtggcg 

a RDWGAVVCRREAVPGLEAHR 

b ETGELSSADEKRYRALKRTA 

c RLGSCRLPTRSGTGP*SAPQ-

agagagagagctgctgatgaacgcagatgtcatctgctgcacatgtgtgggcgccggtga 

1981 + + + + + + 2040 

tctctctctcgacgactacttgcgtctacagtagacgacgtgtacacacccgcggccact 

a RERAADERRCHLLHMCGRR* 

b ERELLMNADVI CCTCVGAGD 

c RESC**TQMSSAAHVWAPVT- 

cccgaggccggccaagatgcagttccgctccattttaatcgacgaaagcacccaggccac 

2041 + + + + + + 2100 

gggctccgaccggttctacgtcaaggcgaggtaaaattagctgctttcgtgggtccggtg 

a PEAGQDAVPLHFNRRKHPGH 

b PRLAKMQFRS ILIDESTQAT 

c RGWPRCSSAPF*STKAPRPP- 

cgagccggagtgcatggttcccgtggtcctcggggccaagcagctgatccttgtaggcga 

2101 + + + + + + 2160 

gctcggcctcacgtaccaagggcaccaggagccccggttcgtcgactaggaacatccgct 

a RAGVHG S R G P RGQAAD P C R R 

b EPECMVPVVLGAKQLILVGD- 

C SRSAWFPWSSGPSS*SL*AT- 

ccactgccagctgggcccagtggtgatgtgcaagaaggcggccaaggccgggctgtcaca 

2161 + + + + + + 2220 

ggtgacggtcgacccgggtcaccactacacgttcttccgccggttccggcccgacagtgt 

a PLPAGPSGDVQEGGQGRAVT 

b HCQLGPVVMCKKAAKAGLSQ 

c TAS WAQW * CARRRPRPGCH S- 

gtcgctcttcgagcgcctggtggtgctgggcatccggcccatccgcctgcaggtccagta 

2221 + + + + + + 2280 

cagcgagaagctcgcggaccaccacgacccgtaggccgggtaggcggacgtccaggtcat 
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Figure 17-8 

a VALRAPGGAGHPAHPPAGPV 

b SLFERLVVLGIRP I RLQVQY 

c RSSSAWWCWASGPSACRSST- 

ccggatgcaccctgcactcagcgccttcccatccaacatcttctacgagggctccctcca 

2281 + + + + + + 2340 

ggcctacgtgggacgtgagtcgcggaagggtaggttgtagaagatgctcccgagggaggt 

a PDAPCTQRLPIQHLLRGLPP 

b RMHPALSAFPSNI FYEGSLQ- 

c GCTLHSAPSHPTS S TRAP S R - 

gaatggtgtcactgcagcggatcgtgtgaagaagggatttgacttccagtggccccaacc 

2341 + + + + + + 2400 

cttaccacagtgacgtcgcctagcacacttcttccctaaactgaaggtcaccggggttgg 

a EWCHCSGSCEEGI*LPVAPT 

b NGVTAADRVKKGFDFQWPQP-

C MVSLQRIV* RRDLTS SGPN P - 

cgataaaccgatgttcttctacgtgacccagggccaagaggagattgccagctcgggcac 

2401 + + + + — + + 2460 

gctatttggctacaagaagatgcactgggtcccggttctcctctaacggtcgagcccgtg 

a R*TDVLLRDPGPRGDCQLGH 

b DKPMFFYVTQGQEE IAS SGT 

C INRCSST* PRAKRRLPARAP- 

ctcctacctgaacaggaccgaggctgcgaacgtggagaagatcaccacgaagttgctgaa 

2461 + + + + + + 2520 

gaggatggacttgtcctggctccgacgcttgcacctcttctagtggtgcttcaacgactt 

a LLPEQDRGCERGEDHHEVAE

b SYLNRTEAANVEKI TTK LLK 

C PT*TGPRLRTWRRS PRSC * R- 

ggcaggcgccaagccggaccagattggcatcatcacgccctacgagggccagcgctccta 

2521 + + + + + + 2580 

ccgtccgcggttcggcctggtctaaccgtagtagtgcgggatgctcccggtcgcgaggat 

a GRRQAGPDWHHHALRGPALL 

b AGAKPDQIGI ITPYEGQRSY 

C QAPSRTRLASSRPTRASAPT- 
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Figure 17-9 

cctggtgcagtacatgcagttcagcggctccctgcacaccaagctctaccaggaagtgga 
2581 + + + + + + 2640 

ggaccacgtcatgtacgtcaagtcgccgagggacgtgtggttcgagatggtccttcacct 

a PGAVHAVQRLPAHQAL PGSG 

b LVQYMQFSGSLHTKLYQEVE- 

C WCSTCSSAAPCTPSSTRKWR- 

gatcgccagtgtggacgcctttcagggacgcgagaaggacttcatcatcctgtcctgtgt 
2641 + + + + + + 2700 

ctagcggtcacacctgcggaaagtccctgcgctcttcctgaagtagtaggacaggacaca 

a DRQCGRLSGTREGLHH PVLC 

b IASVDAFQGREKDFI I LSCV- 

C SPVWTPFRDARRTSSSCPVC- 

gcgggccaacgagcaccaaggcattggctttttaaatgaccccaggcgtctgaacgtggc 
2701 + + + + + + 2760 

cgcccggttgctcgtggttccgtaaccgaaaaatttactggggtccgcagacttgcaccg 

a AGQRAPRHWLFK* PQASERG 

b RANEHQGIGFLNDPRRLNVA- 

C GPTSTKALAF*MTPGV*TWP- 

cctgaccagagcaaggtatggcgtcatcattgtgggcaacccgaaggcactatcaaagca 
2761 + + + + + + 2820 

ggactggtctcgttccataccgcagtagtaacacccgttgggcttccgtgatagtttcgt 

a PDQSKVWRHHCGQPEGTIKA
b LTRARYGVIIVGNPKALSKQ

c * P EQGMAS SLWATRRHYQS S- 

gccgctctggaaccacctgctgaactactataaggagcagaaggtgctggtggaggggcc 
2821 + + + + + + 2880 

cggcgagaccttggtggacgacttgatgatattcctcgtcttccacgaccacctccccgg 

a AALEPPAELL *GAEGAGGGA 

b PLWNHLLNYYKEQKVLVEGP 
C RSGTTC*TTI RSRRCWWRGR- 

gctcaacaacctgcgtgagagcctcatgcagttcagcaagccacggaagctggtcaacac 
2881 + + + + + + 2940 

cgagttgttggacgcactctcggagtacgtcaagtcgttcggtgccttcgaccagttgtg 
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Figure 17-10 

a AQQPA* EPHAVQQATEAG QH 

b LNNLRESLMQFSKPRKLVNT- 

c STTCVRASCSSASHGSWSTL- 

tatcaacccgggagcccgcttcatgaccacagccatgtatgatgcccgggaggccatcat 
2941 + + + + + + 3000 

atagttgggccctcgggcgaagtactggtgtcggtacatactacgggccctccggtagta 

a YQPGSPLHDHSHV * C P G G H H 

b INPGARFMTTAMYDAREAI I 

c STREPAS*PQPCMMPGRPSS- 

cccaggctccgtctatgatcggagcagccagggccggccttccagcatgtacttccagac 
3001 + + + + + + 3060 

gggtccgaggcagataccagcctcgtcggtcccggccggaaggtcgtacatgaaggtctg 

a PRLRL* SEQPGPAFQHVLPD 

b PGSVYDRSSQGRPSSMYFQT 

c QAPSMIGAARAGLPACTSRP- 

ccatgaccagattggcatgatcagtgccggccctagccacgtggctgccatgaacattcc 
3061 + + + + + + 3120 

ggtactggtctaaccgtactagtcacggccgggatcggtgcaccgacggtacttgtaagg 

a P*PDWHDQCRP*PRGCHEHS 

b HDQIGMISAGPSHVAAMNIP- 

c MTRLA* SVPALATWLP* TF P- 

catccccttcaacctggtcatgccacccatgccaccgcctggctattttggacaagccaa 
3121 + + + + + + 3180 

gtaggggaagttggaccagtacggtgggtacggtggcggaccgataaaacctgttcggtt 

a H PLQPGHATHATAWLFW T S Q 

b I P FNLVM P P M P P PGY FG QAN 

c SPSTWSCHPCHRLAILDKPT-

cgggcctgctgcagggcgaggcaccccgaaaggcaagactggtcgtgggggacgccagaa 
3181 + + + + + + 3240 

gcccggacgacgtcccgctccgtggggctttccgttctgaccagcaccccctgcggtctt 

a RACCRARHPERQDWSWGTPE 

b GPAAGRGTPKGKTGRGGRQK 

c GLLQGEAPRKARLVVGDA RR- 
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Figure 17-11 

gaaccgctttgggcttcctggacccagccagactaacctccccaacagccaagccagcca 
3241 + + + + + + 3300 

cttggcgaaacccgaaggacctgggtcggtctgattggaggggttgtcggttcggtcggt 

a EPLWASWTQPD*PPQQPSQP 

b NRFGLPGPSQTNLPNSQASQ- 

c TALGFLDPARLTSPTAKPAR- 



ggatgtggcgtcacagcccttctctcagggcgccctgacgcagggctacatctccatgag 
3301 + + + + + + 3360

cctacaccgcagtgtcgggaagagagtcccgcgggactgcgtcccgatgtagaggtactc 



a GCGVTALLSGRPDAGLHLHE 

b DVASQPFSQGALTQGY I SMS 

C MWRHSPS LRAP* RRATS P * A- 



ccagccttcccagatgagccagcccggcctctcccagccggagctgtcccaggacagtta 
3361 + + + + + + 3420

ggtcggaagggcctactcggtcgggccggagagggtcggcctcgacagggtcctgtcaat 



a PAFPDEPARPLPAGAVPGQL 

b QPSQMSQPGLSQPELSQDSY- 

C SLPR* ASPASPSRSCPRTVT- 



ccttggtgacgagtttaaatcacaaatcgacgtggcgctctcacaggactccacgtacca 
3421 + + + + + + 3480

ggaaccactgctcaaatttagtgtttagctgcaccgcgagagtgtcctgaggtgcatggt 



a P W * R V * ITNRRGALTGLHVP 

b LGDEFKSQIDVALSQDSTYQ 
c LVTSLNHKSTWRSHRTPRTR 



gggagagcgggcttaccagcatggcggggtgacggggctgtcccagtattaaaaggtggc 
3481 + + + + + + 3540

ccctctcgcccgaatggtcgtaccgccccactgccccgacagggtcataattttccaccg 



a GRAGLPAWRGDGAVPVL KGG 

b GERAYQHGGVTGLSQY * KVA 

c ESGLTSMAG*RGCPS I KRWR- 



ggcggaagagctaagcaacgtggcttagtccatcagcatcttattctgggtaataaaaaa 
3541 + + + + + + 3600

ccgccttctcgattcgttgcaccgaatcaggtagtcgtagaataagacccattatttttt 
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Figure 17-12 

a GGRAKQRGLVHQHL I LGNKK 

b AEELSNVA*SISILFWVIKN- 

c RKS*ATWLSPSASYSG**KM- 

tg 

3601 -- 3602 
ac 

a 
b 
c 
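
The three reading frames (a, b, c) and the complementary strand shown in the map above can be recomputed from the top strand. The following is a minimal sketch assuming the standard genetic code; `CODON_TABLE`, `translate` and `complement` are illustrative names and are not part of the original map output. The example uses the first 60 bases of the HUPF-1 sequence (Figure 17-1).

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order
# (first base varies slowest, third base fastest).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int) -> str:
    """Translate `seq` starting at offset `frame` (0, 1 or 2); '*' marks a stop codon."""
    seq = seq.lower()
    # Unreadable codons (e.g. containing a garbled character) are reported as '?'.
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "?") for i in range(frame, len(seq) - 2, 3)
    )


def complement(seq: str) -> str:
    """Return the complementary strand written in the same left-to-right order."""
    return seq.lower().translate(str.maketrans("acgt", "tgca"))


top = "gcggcggctcggcactgttacctctcggtccggctggcgccgcgggcggtttggtccttt"
for label, frame in zip("abc", range(3)):
    print(label, translate(top, frame))
print(complement(top))
```

Frames b and c come out one residue shorter than in the figure because the map completes the last codon of each row with bases from the following row.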
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Figure 18-1 HMG 

(Linear) MAP of: hshmgicr.em_hum1 check: 9603 from: 1 to: 1200

RL ; HSHMGICR - H. sapiens HMGI-C mRNA for high mobility group protein I-C 

ID HSHMGICR standard; RNA; HUM; 1200 BP. 

XX 

AC Z31595; 
XX 

NI g468705 ... 

March 20, 1998 10:31 

tcttgaatcttggggcaggaactcagaaaacttccagcccgggcagcgcgcgcttggtgc 
1 + + + + + + 60

agaacttagaaccccgtccttgagtcttttgaaggtcgggcccgtcgcgcgcgaaccacg 

a S * ILGQELRKLPARAARAWC 

b LESWGRNSENFQPGQRALGA- 

c LNLGAGTQKTSSPGSARLVQ- 

aagactcaggagctagcagcccgtccccctccgactctccggtgccgccgctgcctgctc 
61 + + + + + + 120 

ttctgagtcctcgatcgtcgggcagggggaggctgagaggccacggcggcgacggacgag 

a KTQELAARPPPTLRCRRCLL 
b RLRS*QPVPLRLSGAAAACS
c DSGASSPSPSDSPVPPLPAP- 

ccgccaccctaggaggcgcggtgccacccactactctgtcctctgcctgtgctccgtgcc 

121 + + + + -- + + 180 

ggcggtgggatcctccgcgccacggtgggtgatgagacaggagacggacacgaggcacgg 

a PPP* EARCHPLLCPLPVLRA 

b RHPRRRGATHYSVLCLCSVP- 

C ATLGGAVP PTTLS SACA PC P - 

cgaccctatcccggcggagtctccccatcctcctttgctttccgactgcccaaggcactt 

181 + + + + + + 240 

gctgggatagggccgcctcagaggggtaggaggaaacgaaaggctgacgggttccgtgaa 

a RPYPGGVS PSSFAFRLPKAL 

b DPIPAESPHPPLLSDCPRHF- 

c TLSRRSLP I LLCFPTAQGTF- 

tcaatctcaatctcttctctctctctctctctctctctgtctctctctctctctctctct 

241 + + + + + + 300 

agttagagttagagaagagagagagagagagagagagacagagagagagagagagagaga 
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Figure 18-2 

a SISISSLSLSLSLSLSLSLS 

b QSQSLLSLSLSLCLSLSLSL-

c NLNLFSLSLSLSVSLSLSLS-

ctctctctctcgcagggtggggggaagaggaggaggaattctttccccgcctaacatttc 
301 + + + + + + 360 

gagagagagagcgtcccaccccccttctcctcctccttaagaaaggggcggattgtaaag 

a LSLSQGGGKRRRNSFPA*HF

b SLSRRVGGRGGGILSPPNIS

c LSLAGWGEEEEEFFPRLTFQ- 

aagggacacaattcactccaagtctcttccctttccaagccgcttccgaagtgctcccgg 
361 + + + + + + 420 

ttccctgtgttaagtgaggttcagagaagggaaaggttcggcgaaggcttcacgagggcc 

a KGHNSLQVSSLSKPLPKCSR 

b RDTIHSKSLPFPSRFRSAPG- 

c GTQFTPSLFPFQAASEVLPV-

tgcccgcaactcctgatcccaacccgcgagaggagcctctgcgacctcaaagcctctctt 

421 + + + + + + 480

acgggcgttgaggactagggttgggcgctctcctcggagacgctggagtttcggagagaa 

a CPQLLIPTRERSLCDLKASL

b ARNS * SQPARGASATSKPLF- 

c PATPDPNPREEPLRPQSLSS- 

ccttctccctcgcttccctcctcctcttgctacctccacctccaccgccacctccacctc 

481 + + + + + + 540

ggaagagggagcgaagggaggaggagaacgatggaggtggaggtggcggtggaggtggag 

a PSPSLPSS SCYLHLHRH LHL - 

b LLPRFPPPLATSTSTAT STS- 

C FSLASLLLLLPPPPPPPPPP- 

cggcacccacccaccgccgccgccgccaccggcagcgcctcctcctctcctcctcctcct 

541 + + + + + + 600 

gccgtgggtgggtggcggcggcggcggtggccgtcgcggaggaggagaggaggaggagga 

a RHPPTAAAATGSASSSPPPP 

b GTHPPPPPPPAAPPPLLLLL- 

c APTHRRRRHRQRLLLSS SS S - 
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Figure 18-3 

cccctcttctctttttggcagccgctggacgtccggtgttgatggtggcagcggcggcag 

601 + + + + + + 660 

ggggagaagagaaaaaccgtcggcgacctgcaggccacaactaccaccgtcgccgccgtc

a PLFSFWQPLDVRC* WWQRRQ 

b PSSLFGSRWTSGVDGGSGGS 
c PLLFLAAAGRPVLMVAAAAA-

cctaagcaacagcagccctcgcagcccgccagctcgcgctcgccccgccggcgtccccag 

661 + + + + + + 720 

ggattcgttgccgtcgggagcgtcgggcggtcgagcgcgagcggggcggccgcaggggtc 

a PKQQQPSQPASSRSPRRRPQ

b LSNSSPRSPPARARPAGVPS- 

C *ATAALAARQLALAPPASPA- 

ccctatcacctcatctcccgaaaggtgctgggcagctccggggcggtcgaggcgaagcgg 
721 + + + + + + 780 

gggatagtggagtagagggctttccacgacccgtcgaggccccgccagctccgcttcgcc 

a PYHLI SRKVLGS SGAVEAKR 

b PITSSPERCWAAPGRSRRSG- 

C LSPHLPKGAGQLRGGRGEAA- 

ctgcagcggccgtagcggcggcgggaggcaggatgagcgcacgcggtgagggcgcggggc 
781 + + + + + + 840 

gacgtcgccgccatcgccgccgccctccgtcctactcgcgtgcgccactcccgcgccccg 

a LQRR*RRREAG*AHAVRARG 
b CSGGSGGGRQDERTR* GRGA 

c AAAVAAAGGRMSARG E GAGQ- 

agccgtccacttcagcccagggacaacctgccgccccagcgcctcagaagagaggacgcg 
841 + + + + + + 900

tcggcaggtgaagtcgggtccctgttggacggcggggtcgcggagtcttctctcctgcgc 

a SRPLQPRDNLPPQRLRREDA 

b AVHFSPGTTCRPSASEERTR- 

C PSTSAQGQPAAPAPQKRGRG- 

gccgccccaggaagcagcagcaagaaccaaccggtgagccctctcctaagagacccaggg 
901 + + + + + + 960

cggcggggtccttcgtcgtcgttcttggttggccactcgggagaggattctctgggtccc 
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Figure 18-4 

a AAPGSSSKNQPVS PLLRDP G 

b PPQEAAARTNR*ALS*ETQG- 

c RPRKQQQEPTGE PSPKRPRG- 

gaagacccaaaggcagcaaaaacaagagtccctctaaagcagctcaaaagaaagcagaag 
961 + + + + + + 1020

cttctgggtttccgtcgtttttgttctcagggagatttcgtcgagttttctttcgtcttc 

a EDPKAAKTRVPLKQLKRKQK 

b KTQRQQKQESL* SSSKESRS 

c RPKGSKNKS PSKAAQKKAEA- 

ccactggagaaaaacggccaagaggcagacctaggaaatggccacaacaagttgttcaga 

1021 + + + + + + 1080 

ggtgacctctttttgccggttctccgtctggatcctttaccggtgttgttcaacaagtct

a PLEKNGQEADLGNGHNKL FR 

b HWRKTAKRQT* EMATTSCSE- 

C TGEKRPRGRPRKWPQQVVQK- 

agaagcctgctcaggaggaaactgaagagacatcctcacaagagtctgccgaagaggact 

1081 + + + + + + 1140 

tcttcggacgagtcctcctttgacttctctgtaggagtgttctcagacggcttctcctga 

a RSLLRRKLKRHPHKSLPKRT 

b EACSGGN* RDI LTRVCRRGL 

c KPAQEETEETSSQESAEED*- 

agggggcgccaacgttcgatttctacctcagcagcagttggatcttttgaagggagaaga 

1141 + + + + + + 1200 

tcccccgcggctgcaagctaaagatggagtcgtcgtcaacctagaaaacttccctcttct 

a RGRQRS I STSAAVGS FEGRR 

b GGANVRFLPQQQLDLLKGE 
C GAPTFDFYLSSSWI F*REK 
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Figure 19-1 NSP-A 

(Linear) MAP of: hsnspa check: 4619 from: 1 to: 3202 

RL ; HSNSPA - Homo sapiens neuroendocrine-specific protein A (NSP) mRNA, complete

ID HSNSPA standard; RNA; HUM; 3202 BP. 

AC L10333; 
NI g307306 

DT 16-JUN-1993 (Rel . 36, Created) 

DT 18-JAN-1995 (Rel. 42, Last updated, Version 2) . . . 

With 1 enzymes : NOTI 

ctgagacaccgcagcttccctgagcgccgagtccctccggggacagcagcagggagcgcc
1 + + + + + + 60

gactctgtggcgtcgaagggactcgcggctcagggaggcccctgtcgtcgtccctcgcgg 

a LRHRS FPERRVPPGTAAGSA 

b *DTAASLSAESLRGQQQGAP- 

c ET PQLP*APSPSGDSSRERP- 

cgcgcagccaccgagcctctgcccagccaagccgccgtcgccgcgccgggggaccgccag 

61 + + + + + + 120 

gcgcgtcggtggctcggagacgggtcggttcggcggcagcggcgcggccccctggcggtc 

a RAATEPLPSQAAVAAPGDRQ 

b AQPPSLCPAKPPSPRRGTAS- 

c RSHRASAQPSRRRRAGGPPA- 

ccatggccgcgccgggggatccgcaggacgagctgctgccgctggccggccccgggtccc 

121 + + + + + + 180 

ggtaccggcgcggccccctaggcgtcctgctcgacgacggcgaccggccggggcccaggg 

a PWPRRGIRRTSCCRWPAPGP 

b HGRAGGSAGRAAAAGRPRVP

c MAAPGDPQDELLPLAGPGSQ- 

agtggctcaggcaccgaggggagggggagaacgaagcggtgacgccgaaaggggccacgc

181 + + -f + + + 240 

tcaccgagtccgtggctcccctccccctcttgcttcgccactgcggctttccccggtgcg 

a SGSGTEGRGRTKR* RRKGPR 

b VAQAPRGGGERSGDAERGHA- 

C WLRHRGEGENEAVTPKGATP- 
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Figure 19-2 

cggcgccgcaggctggggagcccagcccggggttgggcgccagggcccgggaagcggcgt 

241 + + + + + + 300 

gccgcggcgtccgacccctcgggtcgggccccaacccgcggtcccgggcccttcgccgca 

a RRRRLGSPARGWAPGPGKRR

b GAAGWGAQPGVGRQG P GS GV 

c APQAGEPSPGLGARAR EAAS- 

cgcgggaagccggctcgggccccgcccggcagtcgcccgttgccatggaaactgcatcca 

301 + + + + + + 360 

gcgcccttcggccgagcccggggcgggccgtcagcgggcaacggtacctttgacgtaggt 

a RGKPARAPPGSRPLPWKLHP 

b AGSRLGPRPAVARCHGNCIH- 

c REAGSGPARQSPVAMETAS T - 

caggtgtggcaggtgtttccagtgccatggaccacaccttctcaacaacatcaaaagatg 

361 + + + + + + 420

gtccacaccgtccacaaaggtcacggtacctggtgtggaagagttgttgtagttttctac 

a QVWQVFPVPWTTPSQQHQKM 

b RCGRCFQCHGPHLLNNI KRW 

C GVAGVS SAMDHTFS TTS KD G- 

gggaaggatcgtgttacacatctctcatttctgacatctgctatccacctcaggaggatt 

421 + + + + + + 480 

cccttcctagcacaatgtgtagagagtaaagactgtagacgataggtggagtcctcctaa 

a GKDRVTHLSFLTSAI HLRR I 

b GRIVLHISHF*HLLSTSGGF-

c EGSCYTSLISDICYPPQEDS- 

ctacatattttactggaattcttcagaaggaaaatggccacgtcaccatttcagagagcc 

481 + + + + + + 540 

gatgtataaaatgaccttaagaagtcttccttttaccggtgcagtggtaaagtctctcgg 

a LHILLEFFRRKMATS PFQRA 

b YIFYWNSSEGKWPRHHFREP- 

c TYFTGILQKENGHVTI SES P - 

ctgaggagctgggtacacccggcccctccttaccagatgtgcctgggatagagtctcgtg 

541 + + + + + + 600 

gactcctcgacccatgtgggccggggaggaatggtctacacggaccctatctcagagcac 
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Figure 19-3 

a LRSWVHPAPPYQMCLG*SLV

b *GAGYTRPLLTRCAWDRVSW-

c EELGTPGPSLPDVPGIESRG- 

gcttatttagttctgattctggaatagagatgactcctgcagagtccacggaagtgaaca 

601 + + + + + + 660 

cgaataaatcaagactaagaccttatctctactgaggacgtctcaggtgcctccacttgt 

a AYLVLILE* R* LLQSPRK* T 

b LI*F*FWNRDDSCRVHGSEQ- 

c LFSSDSGIEMTPAESTEVNK-

agatcttagcagaccctctggaccagatgaaagcagaggcctataaatacattgacataa

661 + + + + + + 720

tctagaatcgtctgggagacctggtctactttcgtctccggatatttatgtaactgtatt

a RS*QTLWTR*KQRPINTLT*

b DLSRPSGPDESRGL* I H * H N - 

c ILADPLDQMKAEAYKY I D I T - 

ccagacccgaggaggtgaagcaccaagaacaacatcaccccgagctggaagataaagact 

721 + + + + + + 780 

ggtctgggctcctccacttcgtggttcttgttgtagtggggctcgaccttctatttctga 

a PDPRR*STKNNITPSWKIKT

b QTRGGEAPRTTSPRAGR*RL- 

c RPEEVKHQEQHHPELEDKDL- 

tggactttaagaataaagacactgacatctcaattaaacctgaaggagtccgtgaacctg 

781 + + + + + + 840 

acctgaaattcttatttctgtgactgtagagttaatttggacttcctcaggcacttggac

a WTLRIKTLTSQLNLKESVNL 

b GL*E*RH*HLN*T*RSP*T* 

c DFKNKDTDI S IKPEGVREPD- 

acaaaccagctcctgtggagggaaaaatcatcaaggaccatttattggaagaatccacat 

841 + + + + + + 900 

tgtttggtcgaggacacctccctttttagtagttcctggtaaataaccttcttaggtgta 

a TNQLLWREKSSRTIYWKNPH

b QTSSCGGKNHQGPFIGRIHI 

C KPAPVEGKI I KDHLLEESTF- 



Figure 19-4 

ttgctccatacatagatgatctctctgaagaacagcgcagggctcctcagatcaccaccc 

901 + + + + + + 960 

aacgaggtatgtatctactagagagacttcttgtcgcgtcccgaggagtctagtggtggg 

a LLHT*MISLKNSAGLLRSPP 

b CSIHR* SL*RTAQGS SDHHP 

c APYIDDLSEEQRRAPQI TTP- 

ctgtcaaaatcacactgacggaaatagaaccttctgttgaaaccactacccaagagaaga 

961 + + + + + + 1020 

gacagttttagtgtgactgcctttatcttggaagacaactttggtgatgggttctcttct 

a LSKSH* RK*NLLLKPLPKRR 

b CQNHTDGNRTFC* NHYPRED- 

C VKI TLTE I EPSVETTTQ E K T - 

cccctgagaagcaagatatatgtctaaagccaagtcctgacacagtccccactgtcactg 

1021 + + + + + + 1080 

ggggactcttcgttctatatacagatttcggttcaggactgtgtcaggggtgacagtgac 

a PLRSKIYV*SQVLTQSPLSL

b P*EARYMSKAKS*HSPHCHC

c PEKQDICLKPSPDTVPTVTV- 

tctcggagcctgaagacgacagcccaggatctatcacccctccatcttctggaacagaac 

1081 + + + + + + 1140

agagcctcggacttctgctgtcgggtcctagatagtggggaggtagaagaccttgtcttg 

a SRSLKTTAQDLSPLHLLEQN 

b L G A * RRQPRIYHPS I F W N R T - 

C SEPEDDSPGSITPPSSGTEP- 

catctgctgcagaatcccaggggaaaggcagcatctccgaggatgagctgatcaccgcca 

1141 --- + + + + + + 1200 

gtagacgacgtcttagggtcccctttccgtcgtagaggctcctactcgactagtggcggt 

a HLLQNPRGKAASPRMS * S PP 

b I CCRI PGERQHLRG * ADHRH 

c SAAESQGKGS ISEDEL I T A I - 

tcaaagaagcaaagggattatcgtatgaaaccgccgagaacccacggccggtgggccagc 

1201 + + + + + + 1260 

agtttcttcgtttccctaatagcatactttggcggctcttgggtgccggccacccggtcg 
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Figure 19-5 

a SKKQRDYRMKPPRTHGRWAS 

b QRSKGI IV*NRREPTAGGPA- 

c KEAKGLSYETAENPRPVGQL-

tggccgacaggcccgaggtcaaggccaggtccggaccgccaaccatccccagccccctgg 

1261 + + + + + + 1320 

accggctgtccgggctccagttccggtccaggcctggcggttggtaggggtcgggggacc 

a WPTGPRSRPGPDRQPSPAPW 
b GRQARGQGQVRTANHPQPPG

c ADRPEVKARSGPPTI PSPLD- 

accacgaggccagcagcgcggagtcgggggactcagagatcgagctggtgtccgaggacc 

1321 + + + + + + 1380 

tggtgctccggtcgtcgcgcctcagccccctgagtctctagctcgaccacaggctcctgg 

a TTRPAARSRGTQRS S WC PRT 

b PRGQQRGVGGLRDRAGVRGP- 

c HEASSAESGDSE I ELVS EDP- 

ccatggccgcggaggacgcgctgccctcaggctatgtgagctttggccacgtgggcggcc 

1381 + + + + + + 1440 

ggtaccggcgcctcctgcgcgacgggagtccgatacactcgaaaccggtgcacccgccgg 

a PWPRRTRCPQAM*ALATWAA 

b HGRGGRAALRL CELWPRGRP- 

c MAAEDALPSGYVS FGHVGG P - 

cgccgccctcgcccgcctcgccatccatccagtacagcatcctgagggaggagcgcgagg 

1441 + + + + + + 1500 

gcggcgggagcgggcggagcggtaggtaggtcatgtcgtaggactccctcctcgcgctcc 

a RRPRPPRHPSSTAS * GRSAR 

b A ALARLAIHPVQHPEGGARG- 

c PPS PASPSIQYSILREEREA- 

ccgagctggacagcgagctcatcatcgagtcgtgcgacgcctcctcggcctcggaggaga 

1501 + + + + + + 1560 

ggctcgacctgtcgctcgagtagtagctcagcacgctgcggaggagccggagcctcctct 

a PSWTASSSSSRATPPRPRRR 

b RAGQRAHHRVVRRLLGLGGE- 

C ELDSELI IESCDASSASEES- 
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Figure 19-6 

gccccaagcgggagcaggactcacccccgatgaagcccagcgccctggatgccatccggg 

1561 + + + + + + 1620

cggggttcgccctcgtcctgagtgggggctacttcgggtcgcgggacctacggtaggccc 

a APSGSRTHPR*SPAPWMPSG 

b PQAGAGLTPDEAQRPGCHPG 

c PKREQDS PPMKP SALDA I R E - 

aggagactggcgtccgggccgaggagcgtgcgccaagccggcggggcctggccgagccgg 

1621 + + + + + + 1680 

tcctctgaccgcaggcccggctcctcgcacgcggttcggccgccccggaccggctcggcc 

a RRLASGPRSVRQAGGAWPSR 

b GDWRPGRGACAKPAGPGRAG 

c ETGVRAEERA P S RRGLAE P G - 

gttccttcctcgactacccctcaactgagccccagcctggccccgagctgccccctggag 

1681 + + + + + + 1740 

caaggaaggagctgatggggagttgactcggggtcggaccggggctcgacgggggacctc 

a VPSSTTPQLSPSLAPSCPLE 

b FLPRLPLN*APAWPRAAPWR- 

c SFLDYPSTEPQPGPELPPGD- 

acggagccctggagcctgagacgcccatgttgccacggaagcctgaagaagactcgagtt 

1741 + + + + + + 1800 

tgcctcgggacctcggactctgcgggtacaacggtgccttcggacttcttctgagctcaa 

a TEPWSLRRPCCHGSLKKTRV 

b R S P G A * D A H V A T E A * R R L E F - 

c GALEPETPMLPRKPEEDSSS- 

ccaaccaaagtcctgcggccacaaagggccctgggcctctaggtcctggcgccccgcccc 

1801 + + + + + 1860 

ggttggtttcaggacgccggtgtttcccgggacccggagatccaggaccgcggggcgggg 

a PTKVLRPQRALGL*VLAPRP 

b QPKSCGHKGPWASRSWRPAP- 

c NQSPAATKGPGPLGPGAPPP- 

cactgctgtttctcaataagcaaaaagctattgacctgttgtattggcgggacatcaagc 

1861 + + + + + + 1920 

gtgacgacaaagagttattcgtttttcgataactggacaacataaccgccctgtagttcg 
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Figure 19-7 

a HCCFSISKKLLTCCIGGTSS 
b TAVSQ*AKSY * PVVLAGHQA 

c LLFLNKQKA I DLLYWRD I KQ- 

agacgggcatcgtgtttgggagtttcctgctgctgctcttctccctgacccagttcagcg 

1921 + + + + + --- + 1980 

tctgcccgtagcacaaaccctcaaaggacgacgacgagaagagggactgggtcaagtcgc 

a RRASCLGVSCCCSSP*PSSA 
b DGHRVWEFPAAALLPDPVQR 
c TGIVFGSFLLLLFS LTQFSV- 

tggtgagcgtcgtggcctacctggccctggccgcactctcagccaccatcagtttccgca 

1981 -t + + + + + + 2040 

accactcgcagcaccggatggaccgggaccggcgtgagagtcggtggtagtcaaaggcgt

a W*ASWPTWPWPHSQPPSVSA 

b GERRGLPGPGRTLSHHQFPH- 

c VSVVAYLALAALSATISFRI- 

tctacaagcctgttttacaagcagtgcagaaaaccgacgaaggccaccctttcaaggcct 

2041 + + + + + + 2100 

agatgttcagacaaaatgttcgtcacgtcttttggctgcttccggtgggaaagttccgga 

a STSLFYKQCRKPTKATLSRP 
b LQVCFTS SAEN RRRP PFQ GL 

c YKSV LQAVQKTDEGH P F KAY- 

acttggagcttgagatcaccctttctcaggagcagattcagaagtacacggactgcctgc 

2101 + + + + + + 2160 

tgaacctcgaactctagtgggaaagagtcctcgtctaagtcttcatgtgcctgacggacg 

a TWSLRSPFLRSRFRSTRTAC 

b LGA*DHPFSGADSEVHGLPA- 

c LELEITLS QEQIQKYTDCLQ- 

agttctacgtgaacagcacacttaaggaactgaggaggctcttccttgtccaggacctgg 

2161 + + + + + + 2220 

tcaagatgcacttgtcgtgtgaattccttgactcctccgagaaggaacaggtcctggacc 

a SST*TAHLRN*GGSSLSRTW 
b VLREQHT*GTEEALPCPGPG 
c FYVNSTLKELRRLFLVQDLV- 
Figure 19-8

tggattccttaaaatttgcagtcctgatgtggctcctgacctacgttggcgctctcttca

2221 + + + + + + 2280 

acctaaggaattttaaacgtcaggactacaccgaggactggatgcaaccgcgagagaagt 

a WIP*NLQS*CGS*PTLALSS

b GFLKICSPDVAPDLRWRSLQ

c DSLKFAVLMWLLTYVGALFN-

atggcctgaccctgctgctcatggctgtggtttcaatgtttactctacctgtagtgtatg 

2281 + + + + + + 2340 

taccggactgggacgacgagtaccgacaccaaagttacaaatgagatggacatcacatac 

a MA*PCCSWLWFQCLLYL*CM 

b WPDPAAHGCGFNVYSTCSVC- 

c GLTLLLMAVVSMFTLPVVYV-

ttaagcaccaggcacagattgaccaatatctgggacttgtgaggactcacataaatgctg 

2341 + + + + + + 2400 

aattcgtggtccgtgtctaactggttatagaccctgaacactcctgagtgtatttacgac 

a LSTRHRLTNIWDL*GLT*ML

b *APGTD*PISGTCEDSHKCC- 

c KHQAQIDQYLGLVRTHINAV-

ttgtggcaaagattcaggctaaaatcccaggcgctaagaggcacgctgagtaaactgatt 

2401 + + + + + + 2460

aacaccgtttctaagtccgattttagggtccgcgattctccgtgcgactcatttgactaa 

a LWQRFRLKSQALRGTLSKLI 

b CGKDSG*NPRR*EAR*VN*F

c VAKIQAKIPGAKRHAE*TDF-

tcccaccggggactggacacaaacaggaatgtctggagtggtaacagctctcttcttact

2461 + + + + + + 2520 

agggtggcccctgacctgtgtttgtccttacagacctcaccattgtcgagagaagaatga 

a SHRGLDTNRNVWSGNSSLLT 

b PTGDWTQTGMSGVVTALFLL

c PPGTGHKQECLEW*QLSSYS-

cattactgcaaattgattgtctttcccccctccctccagtaccataatcttagagacaaa 

2521 + + + + + + 2580 

gtaatgacgtttaactaacagaaaggggggagggaggtcatggtattagaatctctgttt 
Figure 19-9

a HYCKLIVFPPSLQYHNLRDK

b ITAN*LSFPPPSSTIILETN-

c LLQIDCLSPLPPVP*S*RQT- 

ccttaaaacagctgtttttaggctgttccttgtactcttaggatatttgagtcacttgtg 

2581 + + + + + + 2640 

ggaattttgtcgacaaaaatccgacaaggaacatgagaatcctataaactcagtgaacac 

a P*NSCF*AVPCTLRIFESLV

b LKTAVFRLFLVLLGYLSHLC- 

c LKQLFLGCSLYS*DI*VTCV-

tcaaccactaaagtatagagaaaagtgtattagatgtggtttttaattttgtgttgctaa

2641 + + + + + + 2700 

agttggtgatttcatatctcttttcacataatctacaccaaaaattaaaacacaacgatt 

a STTKV*RKVY*MWFLILCC* 

b QPLKYREKCIRCGF*FCVAK-

c NH*SIEKSVLDVVFNFVLLK-

aaaaagtgcatgatggtgagagcccaagttatctttccctcttcggtgttcttcttctct 

2701 + + + + + + 2760 

tttttcacgtactaccactctcgggttcaatagaaagggagaagccacaagaagaagaga 

a KKCMMVRAQVIFPSSVFFFS 

b KSA*W*EPKLSFPLRCSSSL-

c KVHDGESPSYLSLFGVLLLF-

tctctgcaatgcttctgtagcttctaatgttccccgtggctaggcctttcctgccgagtg 

2761 + + + + + + 2820 

agagacgttacgaagacatcgaagattacaaggggcaccgatccggaaaggacggctcac 

a SLQCFCSF*CSPWLGLSCRV 

b LCNASVASNVPRG*AFPAEC

c SAMLL*LLMFPVARPFLPSA-

ctctgatgcaatagtggaaatcgcttatatgtccttgggttgctggttggattaatcttt 

2821 + + + + + + 2880 

gagactacgttatcacctttagcgaatatacaggaacccaacgaccaacctaattagaaa 

a L*CNSGNRLYVLGLLVGLIF

b SDAIVEIAYMSLGCWLD*SL-

c LMQ*WKSLICPWVAGWINL*- 
Figure 19-10

aataacaatatatagaattgtagactgatgttttagcatttttccaacacacacaacgta 
2881 + + + + + + 2940 

ttattgttatatatcttaacatctgactacaaaatcgtaaaaaggttgtgtgtgttgcat 

a NNNI*NCRLMF*HFSNTHNV 

b ITIYRIVD*CFSIFPTHTT* 

c *QYIEL*TDVLAFFQHTQRK-

aaaataaaagcagtcgaccgcacttatggtaatcagttttgtataacttaaaataattaa 
2941 + + + + + + 3000

ttttattttcgtcagctggcgtgaataccattagtcaaaacatattgaattttattaatt

a KIKAVDRTYGNQFCIT*NN*

b K*KQSTALMVISFV*LKIIK-

c NKSSRPHLW*SVLYNLK*LN-

ataaatgaataaatccaaaacaaacatgcagtacttttgttgtatgggattggtgggctg 
3001 + + + + + + 3060

tatttacttatttaggttttgtttgtacgtcatgaaaacaacataccctaaccacccgac 

a INE*IQNKHAVLLLYGIGGL

b *MNKSKTNMQYFCCMGLVG* 

c K*INPKQTCSTFVVWDWWAD- 

atttacatgtatggttactaaaaagtaccagcatgttaactttattacaatttgtattac 

3061 + + + + + + 3120

taaatgtacataccaatgatttttcatggtcgtacaattgaaataatgttaaacataatg 

a IYMYGY*KVPAC*LYYNLYY

b FTCMVTKKYQHVNFITICIT

c LHVWLLKSTSMLTLLQFVLL-

tttctctgtagttcctaatggattcaattacggactctggatatttgcacttatgtactt 
3121 + + + + + + 3180 

aaagagacatcaaggattacctaagttaatgcctgagacctataaacgtgaatacatgaa 

a FLCSS*WIQLRTLDICTYVL

b FSVVPNGFNYGLWIFALMYL-

c SL*FLMDSITDSGYLHLCT*-

gatactgaatgcataaataaat 

3181 + - 3202 

ctatgacttacgtatttattta 
Figure 19-11

a DTECINK
b ILNA*IN

c Y*MHK*

Enzymes that do cut:

NONE

Enzymes that do not cut:
NotI
Figure 21-1 Restriction sites generated by dinucleotide deletions in transcripts of β amyloid precursor protein (βAPP) and Ubiquitin B

In general, a mutation in the nucleotide sequence can change the
restriction enzyme recognition sites of the sequence. The GA deletion in exon
9 of β-APP and the GT deletion in the first repeat of Ubi-B do not alter the
restriction map of the sequence downstream and upstream of the deletion. However,
the GA deletion in exon 10 of β-APP creates an Msl-I site at the site of
the deletion (Fig. 21-2, -3, -4).

In the Ubi-B sequence, the CT deletion in the second repeat leads to the loss
of a Hin4-I and a BstX-I site, the creation of a Cje-I site upstream of the
deletion site, and the creation of a Bsr-I and a TspR-I site downstream of the
deletion site (Fig. 21-5 to 21-8).
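The comparison described above, in which a wild-type sequence and its dinucleotide-deleted counterpart are each scanned for recognition sites and the two maps are then compared, can be sketched as follows. This is a minimal illustration only, not the mapping program that produced the figures. The function names, the short test sequence and the three-enzyme table are assumptions introduced for the example; the recognition sequences are given as commonly catalogued and should be verified against an enzyme database before use. The test sequence is hypothetical and is not the β-APP or Ubi-B sequence.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}

# Recognition sequences for three enzymes named in the text.
# ASSUMPTION: taken from general enzyme catalogues, not from this document.
SITES = {
    "MslI": "CAYNNNNRTG",
    "BsrI": "ACTGG",
    "TspRI": "CASTG",
}


def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def site_positions(seq, site):
    """0-based start positions (top-strand coordinates) of a site on either strand."""
    pattern = "".join(IUPAC[base] for base in site)
    regex = re.compile("(?=" + pattern + ")")  # lookahead finds overlapping hits
    hits = {m.start() for m in regex.finditer(seq)}
    n = len(seq)
    # Hits on the reverse strand, converted back to top-strand coordinates.
    hits |= {n - m.start() - len(site) for m in regex.finditer(revcomp(seq))}
    return sorted(hits)


def restriction_map(seq):
    """Map each enzyme name to the list of positions where its site occurs."""
    seq = seq.upper()
    return {name: site_positions(seq, site) for name, site in SITES.items()}


def delete(seq, start, length=2):
    """Return seq with `length` nucleotides removed beginning at `start`."""
    return seq[:start] + seq[start + length:]


def compare_maps(wild_type, mutant):
    """Return (enzymes whose sites are gained, enzymes whose sites are lost)."""
    wt, mut = restriction_map(wild_type), restriction_map(mutant)
    gained = sorted(e for e in SITES if mut[e] and not wt[e])
    lost = sorted(e for e in SITES if wt[e] and not mut[e])
    return gained, lost


# Hypothetical wild-type fragment; deleting the "GA" at index 7 brings
# CAC and ATG into the spacing required by the MslI pattern.
WILD_TYPE = "TTCACAAGAAAATGTT"
MUTANT = delete(WILD_TYPE, 7)

if __name__ == "__main__":
    print(compare_maps(WILD_TYPE, MUTANT))
```

In this toy case the deletion shortens the spacer between CAC and ATG so that the ten-nucleotide MslI pattern is satisfied, which is the same kind of event the text reports for the GA deletion in exon 10 of β-APP. Both strands are scanned because a recognition site on the complementary strand is cut as well.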
Figure 21-2
Enzymes that do cut:
Figure 21-3

(Linear) MAP of: appex9 ΔGA check: 8734 from: 1 to: 66
GA deletion exon 9: no difference in restriction map with wild type (Fig. 21-2)
With 224 enzymes: * 

March 10, 1998 11:57 
Figure 21-4

(Linear) MAP of: appex10 ΔGA check: 8854 from: 1 to: 66
GA deletion exon 10: no difference in restriction map with wild type (Fig. 21-2)
With 224 enzymes: * 

March 10, 1998 11:57 
Figure 21-5

(Linear) MAP of: ubiwtl check: 3896 from: 1 to: 34 



With 224 enzymes: * 



March 10, 1998 11:07 
Figure 21-6

(Linear) MAP of: ubi Δgt check: 8822 from: 1 to: 32
With 224 enzymes: *

March 10, 1998 11:08
Figure 21-7

(Linear) MAP of: ubiwt2 check: 8814 from: 1 to: 28 



With 224 enzymes: * 

March 10, 1998 11:08 
Figure 21-8

(Linear) MAP of: ubi Δct check: 4758 from: 1 to: 26

With 224 enzymes: *

March 10, 1998 11:08
BpmI
Bpu10I
Bpu1102I
BsaI
BsaAI
BsaBI
BsaHI
BsaJI
BsaWI
BsaXI
BsbI
BscGI
BseMII
BseRI
BseSI
BsgI
BsiEI
BsiHKAI
BslI
BsmI
BsmAI
BsmBI
BsmFI
Bsp24I
Bsp24I
Bsp1286I
BspEI
BspGI
BspLU11I
BspMI
BsrBI
BsrDI
BsrFI
BsrGI
BssHII
BssSI
Bst4CI
BstAPI
BstDSI
BstEII
BstXI
BstYI
BstZ17I
Bsu36I
BtsI
Cac8I
CjePI
CjePI
ClaI
CviJI
CviRI
DdeI
DpnI
DraI
DraIII
DrdI
DrdII
EaeI
EagI
EarI
EciI
Eco47III
Eco57I
EcoNI
EcoO109I
EcoRI
EcoRII
EcoRV
FauI
Fnu4HI
FokI
FseI
FspI
GdiII
HaeI
HaeII
HaeIII
HaeIV
HgaI
HgiEII
HhaI
Hin4I
HincII
HindIII
HinfI
HpaI
HphI
KpnI
MaeII
MaeIII
MboII
MluI
MmeI
MscI
MseI
MslI
MspI
MspA1I
MunI
MwoI
NarI
NciI
NcoI
NdeI
NgoAIV
NheI
NlaIII
NlaIV
NotI
NruI
NsiI
NspI
NspV
PacI
Pfl1108I
PflMI
PinAI
PleI
PmeI
PmlI
PshAI
Psp5II
PstI
PvuI
PvuII
RcaI
RleAI
RsaI
RsrII
SacI
SacII
SalI
SanDI
SapI
Sau3AI
Sau96I
SbfI
ScaI
ScrFI
SexAI
SfaNI
SfcI
SfiI
SgfI
SgrAI
SimI
SmaI
SmlI
SnaBI
SpeI
SphI
SrfI
Sse8647I
SspI
Sth132I
StuI
StyI
SunI
SwaI
TaqI
TaqII
TaqII
TatI
TauI
TfiI
ThaI
TseI
Tsp45I
Tsp509I
Tth111I
Tth111II
UbaLI
VspI
XbaI
XcmI
XhoI
XmnI



